RAL Tier1 weekly operations castor 11/11/2013
From GridPP Wiki
Revision author: Matt Viljoen
Latest revision as of 15:31, 8 November 2013
Operations News
- Successful UPS Essential Work intervention for CASTOR and other services
Operations Problems
- The grid-mapfile on the CMS disk servers was found to be outdated. We took this opportunity to move grid-mapfile generation to the new CASTOR admin box (lcgccvm02) and to make grid-mapfile propagation to disk servers consistent by managing it with Quattor.
- The transfermanagerd on the ATLAS LSF stopped with no logging again just before the intervention.
- After the UPS intervention, the CMS and LHCb stager DB schemas had relocated from their preferred nodes to a single node (plutor891). This caused the transfermanagers on these instances to run in a degraded manner, with many database connection timeouts (ORA-12520: TNS:listener could not find available handler for requested type of server).
- One ILC file was found to be corrupted: its physical checksum differs from the Network+NS checksum. The VO has been notified.
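The checksum mismatch reported for the ILC file can be illustrated with a minimal sketch. It assumes the checksum type is Adler-32, which this report does not state, and the function names and the recorded value passed in are hypothetical; this is not the CASTOR tooling itself.

```python
import zlib


def adler32_of_file(path, chunk_size=1024 * 1024):
    """Return the Adler-32 checksum of a file as 8 lowercase hex digits.

    Assumption: Adler-32 is the checksum type; the report does not say so.
    """
    checksum = 1  # Adler-32 starts from 1
    with open(path, "rb") as handle:
        # Read in chunks so large files do not need to fit in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            checksum = zlib.adler32(chunk, checksum)
    return format(checksum & 0xFFFFFFFF, "08x")


def checksum_matches(path, recorded_hex):
    """Compare the on-disk checksum with a recorded value.

    `recorded_hex` stands in for the value held in a catalogue
    (hypothetical input for illustration).
    """
    return adler32_of_file(path) == recorded_hex.strip().lower().zfill(8)
```

A file whose `checksum_matches` result is false against the recorded value would be a candidate for the kind of corruption described above.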
Blocking Issues
- none
Planned, Scheduled and Cancelled Interventions
Entries in/planned to go to GOCDB
- none
Advanced Planning
Tasks
- CASTOR 2.1.14 + SL5/6 testing
Interventions
- none
Staffing
- Castor on Call person
- Rob
- Staff absence/out of the office:
- (Mon-Wed) Matthew@CERN
- (Mon-Tue) Shaun@WLCG meeting