RAL Tier1 weekly operations castor 04/07/2011
From GridPP Wiki
Matt viljoen
Latest revision as of 10:11, 1 July 2011
Operations News
- All tape servers now upgraded to 2.1.10-0
- All worker nodes now upgraded to CASTOR Client 2.1.10-0
- xrootd successfully tested with CMS
- xrootd testing started with LHCb
- New production-class hardware running Oracle 10g is now ready for Facilities, apart from the backup mechanism
Operations Problems
- There were a number of problems on Monday, which we believe followed from the Neptune rack failure over the weekend: tapes were left in an inconsistent state, and the CMS LSF scheduler failed many jobs. Restarting all services on the LSF machine fixed the scheduler problem.
Blocking Issues
none
Planned, Scheduled and Cancelled Interventions
Entries in/planned to go to GOCDB
Description | Start | End | Type | Affected VO(s) |
---|---|---|---|---|
CASTOR 2.1.10-1 upgrade | 05 July 09:00 | 05 July 11:00 | Downtime | All |
Advanced Planning
- Move Tier1 instances to new Database infrastructure with a Dataguard backup instance in R26
- Move Facilities DB instance to new Database hardware running 10g
- Upgrade SRMs to 2.11, which incorporates VOMS support
- Start migrating from T10KA to T10KC media later this year
- Quattorization of remaining SRM servers
- Hardware upgrade, Quattorization and Upgrade to SL5 of Tier1 CASTOR headnodes
Staffing
- Castor on Call person: Chris
- Staff absence/out of the office:
  - (Mon) Shaun A/L
  - (Wed) Matthew working from home AM, A/L PM
  - (Thu) Matthew A/L PM