From GridPP Wiki
Summary of Previous Week
Developments
- Alastair
- Perform Security Audit
- Learn how to deploy disk servers for ATLAS
- Discuss Job Plan with Matt
- Discuss allocation of ATLAS disk space with Brian Davies and Stephen Burke
- Go to Shared Service training
- Andrew
- FTS channel adjustments: timeouts doubled for STAR-FIHIPT2 & RALLCG2-CLOUDCMSITALY
- Disk server deployment (5 servers to cmsFarmRead)
- APEL & PBS comparisons for CREAM CE
- Correcting PBS jobs MySQL table for October
- Resolved problem with PhEDEx mss-remove agent
- Upgraded PhEDEx to 3.2.9
- Completed CMS "dark" data removal
- Investigating consistency between missing files lists from PhEDEx & CASTOR team
- Catalin
- CRISTAL 1 course
- Finished kickstarts for FronTier and SL5 VOBOX; waiting for HW
- Assisted with the LFC ATLAS cleaning operation
- Disk server deployment for ALICE
- Derek
- Updating VO configuration in Quattor
- Testing helpdesk backup
- CRISTAL level 1 course
- SSC Training
- Out sick 1 day
- Matt
- Determine LHCb service class requirements for new allocation
- Disk deployment meeting
- Richard
- ORACLE SSC Training
- Further disk server deployments into ATLAS NonProd (including updates to the TWiki instructions)
- Continued work on BDII/Quattor task
- CASTOR activities: Read through SDW's training slides; work on new pre-prod instance
- Mayo
- Worked on the new Metrics Gathering System
- Thought Bubble website now in operation
- Initial research into IPMI power control project
Operational Issues and Incidents
Description | Start | End | Affected VO(s) | Severity | Status
Plans for Week(s) Ahead
Plans
- Alastair
- Finish security audit (if not already finished)
- Go through gLite training
- Go through CASTOR training slides
- Learn about FTS and the outputs that I will take over from Brian
- Update CPU efficiencies
- Andrew
- Attend CMS Offline & Computing Workshop, CERN
- Catalin
- Ready to deploy SL5 VOBOX for ALICE (waiting for HW)
- Ready to deploy FronTier/squid for ATLAS (waiting for HW)
- Finish ALICE disk server deployment
- Start WMS03 drain
- Derek
- Test helpdesk restore
- Updating Quattor VO configuration
- Update CE documentation
- Matt
- Check priorities for deploying Viglen 08 kit after it passes acceptance tests
- VO requirements capture
- Disaster recovery planning
- Richard
- RPM packaging and installation for new BDII connection throttling script
- RPM packaging and installation for new BDII monitoring script
- Complete Quattor config/build for BDII servers
- CASTOR activities: Continue work on new pre-prod instance
- Mayo
- Continue work on the new Metrics Gathering System
- Begin Stage 2 of on-call documentation project
- Continue research into IPMI power control project
Resource Requests
Downtimes
Description | Hosts | Type | Start | End | Affected VO(s)
WMS03 hotswappable | lcgwms03.gridpp.rl.ac.uk | Scheduled Outage | Oct 30 (09:00) | Nov 05 (16:00) | non-LHC
Requirements and Blocking Issues
Description | Required By | Priority | Status
HW for Squid deployment | ATLAS | High | request made via RT Fabric queue
HW for FronTier deployment | ATLAS | High | request made via RT Fabric queue
HW for SL5 64-bit VOBOX | Alice | High | request made via RT Fabric queue
Hardware for testing LFC/FTS resilience | | High | DataServices want to deploy a DataGuard configuration to test LFC/FTS resilience; request for HW made through RT Fabric queue
Non-capacity HW for testing | | Medium | Still using the old HW
Hardware for PPS | | Medium | We have made a commitment to test PPS pre-releases, and have no hardware dedicated for this.
OnCall/AoD Cover
- Primary OnCall: Catalin (Mon-Thu)
- Grid OnCall:
- AoD: