RAL Tier1 weekly operations castor 10/06/2013
From GridPP Wiki
- Old 2.1.12 DLS now working against 2.1.13 Facilities
- Testing has confirmed that, with the current updated version of FTS, files on draining disk servers no longer cause access problems.
- May errata + kernel rolled out to all test systems.
- (Tue) New CMS workflow started, causing wait I/O contention on the cmsDisk disk servers. Reducing the total slot count and increasing the xrootd weighting brought the instance down for ~2 hours on Wednesday, resulting in a callout. The transfer manager changes were reversed and the CMS load on the batch farm was reduced, which brought CMS back, but it was not until lazy download was turned on again on Thursday that the problem went away.
- Can't upgrade Puppet until someone spends time learning to administer it (to replace Chris); this may delay an SL6 upgrade.
Planned, Scheduled and Cancelled Interventions
Entries in/planned to go to GOCDB: none
- Upgrade central services (NS,CUPV,VDQM) from 2.1.11-9 to 2.1.13-9
- Upgrade stagers from 2.1.12 to 2.1.13
- Castor on Call person
- Staff absence/out of the office:
- (Mon-Wed) Matthew at SDB users group meeting
- (Mon-Tue) Shaun at EUDAT meeting