RAL Tier1 weekly operations castor 09/12/2013
From GridPP Wiki
Revision as of 21:23, 6 December 2013 by Matt viljoen
Operations News
- Stress testing from the batch farm (based on xrootd) has started against 2.1.14-5. No major issues found yet.
- The 2.1.14 logging browser prototype is now working with pure rsyslog+Logstash (i.e. no HBase/HDFS).
- Testing the new SHA-2 certificates (personal and host) against CASTOR preprod has confirmed that everything works.
Operations News (ongoing)
- We have decreased the CASTOR overhead to 1% on 5 production disk servers. We are waiting for them to fill up and will monitor them closely before rolling the change out more widely.
Operations Problems
- A small number (~10^2) of ATLAS files were identified as lost during their mass rename, but none recently.
Blocking Issues
- none
Planned, Scheduled and Cancelled Interventions
Entries in/planned to go to GOCDB
- none
Advanced Planning
Tasks
- CASTOR 2.1.14 + SL5/6 testing
Interventions
- none
Staffing
- CASTOR on-call person:
  - Matthew
- Staff absence/out of the office:
  - (Mon/Tue) Shaun, Rob, Bruno, Chris, Tim at CASTOR F2F at CERN
  - (Mon/Tue) Matthew working from home; available for operations support
  - (Thu AM) Matthew A/L
  - (Fri) Matthew A/L