RAL Tier1 weekly operations castor 24/02/2014
From GridPP Wiki
Revision as of 15:59, 24 February 2014 by Rob appleyard
Operations News
- We have now increased the thread counts on the ATLAS Stager and Transfermanager (in addition to the NS) in an attempt to reduce the "threads busy with CASTOR" errors we see on the SRMs. The number of errors has gone down, but it is unclear whether this is due to the change or simply to lighter load.
- T2K's tape recall problems have been fixed by some adjustments to CASTOR settings and an increased timeout on the transfers.
- The new disk server generation will be deployed into preprod for CASTOR testing in the next week.
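On the ATLAS thread increase above: one way to tell a real improvement from lighter load is to compare errors per request rather than raw error counts. The following is a minimal sketch only. The counts are hypothetical placeholders, not measurements from the SRMs, and the function names and tolerance are illustrative assumptions.

```python
def error_rate(errors: int, requests: int) -> float:
    """Return errors per 1000 requests, or 0.0 if there were no requests."""
    if requests == 0:
        return 0.0
    return 1000.0 * errors / requests


def drop_explained_by_load(before: dict, after: dict, tolerance: float = 0.05) -> bool:
    """Return True if the per-request error rate is essentially unchanged,
    i.e. the fall in raw error count tracks the fall in load.

    `tolerance` is the allowed relative change in rate (an assumed threshold).
    """
    rate_before = error_rate(before["errors"], before["requests"])
    rate_after = error_rate(after["errors"], after["requests"])
    if rate_before == 0.0:
        return rate_after == 0.0
    return abs(rate_after - rate_before) / rate_before <= tolerance


# Hypothetical counts for the week before and the week after the change.
before = {"errors": 120, "requests": 400_000}
after = {"errors": 60, "requests": 200_000}

# Raw errors halved, but so did the requests, so the rate is unchanged.
same_rate = drop_explained_by_load(before, after)
```

With these placeholder numbers the raw error count halves while the rate per request stays the same, which would point to lighter load rather than the extra threads.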
Operations Problems
- We had some CMS SUM test failures between Tuesday and Thursday, which are believed to be due to load on the disk servers.
Blocking Issues
- none
Planned, Scheduled and Cancelled Interventions
Entries in/planned to go to GOCDB
Advanced Planning
Tasks
- CASTOR 2.1.14 + SL5/6 testing. The change control has gone through today with few problems.
- iptables to be installed on lcgcviewer01 to harden the logging system against the injection of junk data by security scans.
- Quattor cleanup process is ongoing.
- Installation of new preprod headnodes.
Interventions
- none
Staffing
- CASTOR on-call person
- Matthew
- Staff absence/out of the office:
- Matt A/L (Thu)