RAL Tier1 weekly operations castor 19/09/2011
Operations News
- none
Operations Problems
- On Wednesday, service for ATLAS was degraded due to a misconfiguration of read-only disk servers. Documentation will be updated to prevent this from happening again.
Blocking Issues
- We need to understand the cause of the new database disk array hardware problem before we can migrate production databases over to it.
Planned, Scheduled and Cancelled Interventions
Entries in/planned to go to GOCDB: none
Advanced Planning
- Move Tier1 instances to new Database infrastructure with a Dataguard backup instance in R26
- Move Facilities DB instance to new Database hardware running 10g
- Upgrade SRMs to 2.11, which incorporates VOMS support
- Start migrating from T10KA to T10KC media later this year
- Certify 2.1.11 and evaluate the new LSF replacement
- Quattorization of remaining SRM servers
- Hardware upgrade, Quattorization and upgrade to SL5 of Tier1 CASTOR headnodes
Staffing
- Castor on Call person: Matthew
- Staff absence/out of the office:
- Matthew on TOIL Wednesday afternoon