RAL Tier1 weekly operations castor 26/10/2009
From GridPP Wiki
Latest revision as of 15:40, 2 November 2009, by Matt viljoen
Summary of Previous Week
- Building Quattor templates for preprod (Richard)
- Deployment+draining training for Alistair (Brian)
- Established locations for ATLAS disk deployment (Brian)
- SRM debugging of disk copy problem - fix in 2.1.8-2 (Shaun)
- Developed DB fix to allow checksumming to work on 2.1.7 (Shaun)
- Deployed new disk servers for LHCb, ATLAS, CMS (Chris)
- Deployed disk servers to nonprod (Tiju, Richard, Alistair, AndrewL)
- Setting up repack (Chris)
- Attending LTUG (Tim)
- All tape drives now up and running (Tim)
- Testing various combinations of EMC kit versus power supply (Cheney)
- Regenerated Nagios config for disk servers (Cheney)
- Build spare tape robot controller (Cheney)
- Build replacement DB server (Cheney)
- Fixed Nagios callout problem (Cheney)
- CASTOR/Fabric work transfer proposal (All)
- Wrote script to bump up the unique file IDs of files with reused IDs (Matthew)
- Making ATLAS file lists for comparison to LFC (Matthew)
- Disaster Management of recent data-loss (Matthew)
Developments for this week
- SRM 2.8-2 deployment on all instances (Shaun)
- Working on Puppet manifest for polymorphic central servers (Chris)
- Set up 2.1.9 on repack server (Chris, Tim)
- Testing Quattor templates on preprod servers (Richard)
- Techwatch newsletter (Cheney)
- Chasing up strategic objectives (Matthew)
- Reviewing preprod plans (Matthew)
- Disaster Management of recent data-loss (Matthew)
- Deploying 3 new disk servers for repack server (Matthew, Shaun)
Ongoing
- Improving resilience on central servers (Chris, Shaun)
- CastorMon monitoring graphs for Gen instance (Brian)
- Disaster recovery document (Matthew)
Operations Issues
none
Blocking issues
none
Planned, Scheduled and Cancelled Down Times
Entries in/planned to go to GOCDB
Description | Start | End | Type | Affected VO(s) |
---|---|---|---|---|
Upgrade SRM to 2.8-2 | 26/10/09 1000 | 26/10/09 1200 | At Risk | ATLAS and LHCb |
Upgrade SRM to 2.8-2 | 27/10/09 1000 | 27/10/09 1200 | At Risk | CMS and Gen |
Changes to Production Milestones
none
Advanced Planning
- Black and White lists will be tested and introduced on ATLAS
- Install/enable gridftp-internal on Gen (This year)
Staffing
- Cheney away (Thurs)
- CASTOR on-call person: Shaun