RAL Tier1 weekly operations castor 11/02/2013
From GridPP Wiki
Revision as of 15:24, 11 February 2013 by Matt viljoen
Operations News
- Preprod instance now functioning again, and test tape server (tcastor200) now upgraded to 2.1.13.
- After successfully testing the 2.1.13 tape server, we have upgraded the first production tape server (lcgcts22) to 2.1.13.
- 2.1.13-7 has now been released, and CERN advises us to upgrade to this version.
- Upgraded test systems to the January errata and kernel.
Operations Problems
- A known bug in which the VO name is obfuscated has reappeared in the ATLAS SRMs. It was last seen in April 2012 (https://savannah.cern.ch/bugs/index.php?91389). The developers are restarting their investigation; the problem appears to be memory corruption introduced by the SRM code.
- aliceDisk is full. The VO has been told.
- Disk server draining for ATLAS is continuing, but very slowly.
Blocking Issues
- Puppet cannot be upgraded until someone spends time learning to administer it (to replace Chris); this may delay an SL6 upgrade.
Planned, Scheduled and Cancelled Interventions
Entries in/planned to go to GOCDB: none
Advanced Planning
Tasks
- Test and certify 2.1.13-7 with simplified Quattor templates
- Turn off Amanda backups
Interventions
- Upgrade tape servers to 2.1.13-7
- Upgrade central services (NS,CUPV,VDQM) from 2.1.11-9 to 2.1.13-7
- Upgrade stagers from 2.1.12 to 2.1.13
Staffing
- Castor on Call person: Matthew
- Staff absence/out of the office:
  - Matthew (Tue-Thu) A/L
  - Shaun (all week) A/L
  - Rob (Fri) A/L