RAL Tier1 weekly operations castor 18/07/2011
From GridPP Wiki
Matt Viljoen
Latest revision as of 08:09, 20 July 2011
Operations News
- 7 SL08 disk servers (140TB) deployed to lhcbRawRdst
- GC policy for all remaining tape-based service classes is now Least Recently Used (LRU)
- FT, CT and DB Team have decided to use Amanda as a means of backing up the new Tier1 databases to tape.
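For readers unfamiliar with the LRU garbage-collection policy mentioned above, the following is a minimal sketch of the idea: when a disk pool is over capacity, the files whose last access is oldest are removed first. It is a toy model, not CASTOR's actual GC code; the class name `LruDiskCache`, its methods and the byte-based capacity are all assumptions made for illustration.

```python
from collections import OrderedDict


class LruDiskCache:
    """Toy model of a disk pool cleaned up by least-recent access (hypothetical)."""

    def __init__(self, capacity_bytes):
        self.capacity_bytes = capacity_bytes
        self.used_bytes = 0
        # Maps file name -> size; the least recently accessed entry comes first.
        self._files = OrderedDict()

    def access(self, name, size):
        """Record an access to `name` and return the names evicted as a result."""
        if name in self._files:
            # Already on disk: just mark it as the most recently used file.
            self._files.move_to_end(name)
            return []

        self._files[name] = size
        self.used_bytes += size

        evicted = []
        # Evict from the least recently used end until the pool fits again.
        # The file that was just added is never evicted by its own arrival.
        while self.used_bytes > self.capacity_bytes and len(self._files) > 1:
            victim, victim_size = self._files.popitem(last=False)
            self.used_bytes -= victim_size
            evicted.append(victim)
        return evicted
```

In this sketch a repeated access to a file moves it to the "recent" end, so a file that is read often survives longer than one written earlier and never touched again.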
Operations Problems
- Gen SRM problems on night of Wed/Thu
- On Thu PM the ATLAS Stager database got into an inconsistent state (a sub-request without an entry in the id2type table), which caused approx. 2 hours of unscheduled downtime. Cause unknown.
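The kind of inconsistency described above (a sub-request with no matching id2type entry) can be detected with an anti-join. The sketch below illustrates the check under heavy assumptions: the table layouts, column names and sample rows are simplified and hypothetical, and SQLite is used only so that the example is self-contained, whereas the real stager database runs on Oracle.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory database with a hypothetical, simplified schema."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE id2type (id INTEGER PRIMARY KEY, type INTEGER);
        CREATE TABLE subrequest (id INTEGER PRIMARY KEY, filename TEXT);
        INSERT INTO id2type (id, type) VALUES (1, 27), (2, 27);
        -- Sub-request 3 deliberately has no id2type row.
        INSERT INTO subrequest (id, filename)
            VALUES (1, 'file_a'), (2, 'file_b'), (3, 'file_c');
        """
    )
    return conn


def find_orphan_subrequests(conn):
    """Return ids of sub-requests that have no entry in id2type."""
    rows = conn.execute(
        """
        SELECT s.id
        FROM subrequest AS s
        LEFT JOIN id2type AS t ON t.id = s.id
        WHERE t.id IS NULL
        ORDER BY s.id
        """
    ).fetchall()
    return [row[0] for row in rows]
```

Calling `find_orphan_subrequests(build_demo_db())` reports the one sub-request that was inserted without a matching id2type row.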
Blocking Issues
none
Planned, Scheduled and Cancelled Interventions
Entries in/planned to go to GOCDB: none
Advanced Planning
- Move Tier1 instances to new Database infrastructure with a Dataguard backup instance in R26
- Move Facilities DB instance to new Database hardware running 10g
- Upgrade SRMs to 2.11, which incorporates VOMS support
- Start migrating from T10KA to T10KC media later this year
- Certify 2.1.11 and evaluate the new LSF replacement
- Quattorization of remaining SRM servers
- Hardware upgrade, Quattorization and Upgrade to SL5 of Tier1 CASTOR headnodes
Staffing
- Castor on Call person: Matthew
- Staff absence/out of the office:
- (Mon-Wed) Shaun on training