Difference between revisions of "RAL Tier1 weekly operations castor 11/03/2013"
From GridPP Wiki
Matt viljoen (Talk | contribs) |
Latest revision as of 10:56, 8 March 2013
Operations News
- Tape verification server (lcgcadm01) now up and running against the Tier 1 CASTOR instances
- V09 generation of WNs upgraded to 2.1.13-9 CASTOR client
- rsyslog TCP logging now being sent from all headnodes, including non-rsyslog-compliant daemons (e.g. nsd, xrd)
- fetch-crl now running every 6 hours on all SRMs
Operations Problems
- (Sun-Tue) Disk manager bug stopped all transfers on gdss590 for 2.5 days. The only error in the diskmanager log was: "Detected stuck ActivityControl thread. Killed connections to transfermanagerd". We haven't seen this error before, but will look out for it in the future.
- Still seeing DB problems ("ORA-32108: max column or parameter size not specified") when testing 2.1.13-9. Have tried Oracle Instant Client versions 11.2.0.3.0-1, 11.2.0.2.0 and 10.2.0.3-4
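A minimal sketch of how the stuck-thread error quoted above could be looked out for in future. It assumes the message appears verbatim on a single log line. The function name and the way the log is supplied are hypothetical and are not part of CASTOR.

```python
import sys
from typing import Iterable, List

# Message text quoted in the notes above; matched as a plain substring.
STUCK_THREAD_MSG = "Detected stuck ActivityControl thread"


def find_stuck_thread_lines(lines: Iterable[str]) -> List[str]:
    """Return every log line that contains the stuck-thread message."""
    return [line.rstrip("\n") for line in lines if STUCK_THREAD_MSG in line]


# Hypothetical usage: pass the path of a diskmanager log as the first argument.
if __name__ == "__main__" and len(sys.argv) > 1:
    with open(sys.argv[1], errors="replace") as log:
        for hit in find_stuck_thread_lines(log):
            print(hit)
```

Run from cron or a monitoring check, a non-empty result would flag the disk server for attention before transfers have been stalled for days.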
Blocking Issues
- Can't upgrade Puppet until someone spends time learning to administer it (to replace Chris); this may delay an SL6 upgrade
Planned, Scheduled and Cancelled Interventions
Entries in/planned to go to GOCDB: none
Advanced Planning
Tasks
- Test and certify 2.1.13-9 with simplified Quattor templates
- Turn off Amanda backups
Interventions
- Upgrade tape servers to 2.1.13-9
- Upgrade central services (NS,CUPV,VDQM) from 2.1.11-9 to 2.1.13-9
- Upgrade stagers from 2.1.12 to 2.1.13
Staffing
- CASTOR on-call person
- Rob
- Staff absence/out of the office:
- Shaun at EUDAT and ISGC (all week)
- Jens at EUDAT (Mon-Tue) and A/L (Fri)