RAL Tier1 weekly operations castor 26/12/2010
From GridPP Wiki
Revision as of 14:56, 22 December 2010 by Matt viljoen (Talk | contribs)
Operations News
- Secondary job managers have been installed on the LSF machines of the remaining instances (LHCb, Gen). This guards against the intermittent bug in which the JM stops processing requests for no apparent reason.
Operations Issues
- ..
Blocking issues
- The lack of production-class hardware running Oracle 10g needs to be resolved before CASTOR for Facilities goes into full production
Planned, Scheduled and Cancelled Interventions
Entries in/planned to go to GOCDB
Description | Start | End | Type | Affected VO(s) | Lead by |
---|---|---|---|---|---|
Update ATLAS disk servers to SL5 64bit | 17/01/2011 08:00 | 18/01/2011 16:00 | Downtime | ATLAS | MV |
Advanced Planning
- CASTOR for Facilities instance in production by end of 2010
- Upgrade ATLAS, CMS, Gen disk servers to SL5 64bit and Quattorize the non-Quattorized disk servers
- CASTOR certification and upgrade to 2.1.9-10, which incorporates the fix for gridftp-internal to support multiple service classes, enabling checksums for Gen
- CASTOR upgrade to 2.1.9-10 and SRM upgrade to 2.10 to fix the unavailable status being reported to FTS when disk servers are draining
Staffing
- CASTOR on-call person: Chris
- Staff absence/out of the office:
- (Christmas holiday - cover from home only)