RAL Tier1 weekly operations castor 30/01/2012
Operations News
- CMS and Gen successfully upgraded to SRM 2.11
- Stress testing with the Transfer Manager has uncovered a number of problems that require further investigation. We have decided not to use the TM immediately after upgrading.
Operations Problems
- LHCb's own tests caused a high number of failures within LSF on Wednesday, which may have adversely affected end users.
Blocking Issues
- none
Planned, Scheduled and Cancelled Interventions
Entries in/planned to go to GOCDB
Description | Start | End | Type | Affected VO(s) | Led by |
---|---|---|---|---|---|
SRM 2.11 upgrade, inc. move to new hardware+SL5+Quattor | 30/01/2012 10:00 | 30/01/2012 12:00 | Downtime | ATLAS | Shaun |
SRM 2.11 upgrade, inc. move to new hardware+SL5+Quattor | 02/02/2012 10:00 | 02/02/2012 12:00 | Downtime | LHCb | Shaun |
CIP 2.2.0 upgrade (STC) | 02/02/2012 12:00 | 22/02/2012 15:00 | At-risk | All | Matthew |
Stage 2 of CASTOR DB move (STC) | 07/02/2012 08:00 | 07/02/2012 16:00 | Downtime | All | Rich |
CASTOR 2.1.11-8 upgrade, inc. move to new hardware+SL5+Quattor (STC) | 13/02/2012 08:00 | 24/02/2012 16:00 | Downtime | All | Matthew |
Advanced Planning
- Move Tier1 instances to new database infrastructure with a Dataguard backup instance in R26
- Switch from LSF to Transfer Manager after 2.1.11 upgrade
- Start using Tape Gateway once CERN have been using it in production for approx. 2 months.
Staffing
- CASTOR on-call person: Matthew
- Staff absence/out of the office:
  - (Mon-Wed) Chris at Contrail conference