RAL Tier1 weekly operations castor 10/02/2014
Revision as of 08:06, 12 February 2014 by Matt viljoen
Operations News
- XROOT with GSI authentication is now enabled on Gen and has been used successfully by T2K.
- The new headnodes for preprod are ready to be deployed.
- Testing of CASTOR 2.1.14 is ongoing.
Operations Problems
- We have a persistent problem on our SRMs where queries to the name server do not get a response. The incidence of these errors correlates very strongly with typical working hours (9am-5pm on weekdays). Investigations into the cause are ongoing.
- We had a callout on the CMS SRMs on Saturday which was related to the ongoing FTS testing.
- LHCb noticed a sudden rise in Input Data Resolution errors at RAL. A new user's DN had not been added to our grid-mapfiles. Investigations showed that the hostcert on lcgccvm02 had expired, which stopped the VOMS handshake when updating the grid-mapfiles. We have now implemented a Nagios cert lifetime check on this box.
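
As an illustration of the kind of cert lifetime check mentioned above, here is a minimal Python sketch. It is not the check deployed on lcgccvm02. The function name `check_cert_lifetime` and the 30-day and 7-day thresholds are assumptions chosen for the example. Only the exit codes (0 OK, 1 WARNING, 2 CRITICAL) follow the standard Nagios plugin convention.

```python
from datetime import datetime, timedelta, timezone

# Standard Nagios plugin exit codes.
OK, WARNING, CRITICAL = 0, 1, 2


def check_cert_lifetime(not_after, now, warn_days=30, crit_days=7):
    """Return (exit_code, message) for a certificate that expires at not_after.

    not_after and now are timezone-aware datetimes. The warn_days and
    crit_days thresholds are hypothetical defaults, not values from the site.
    """
    remaining = not_after - now
    days = remaining.days

    if remaining <= timedelta(0):
        # Already expired: this is the state that broke the grid-mapfile update.
        return CRITICAL, f"CRITICAL: host certificate expired on {not_after:%Y-%m-%d}"
    if remaining < timedelta(days=crit_days):
        return CRITICAL, f"CRITICAL: host certificate expires in {days} days"
    if remaining < timedelta(days=warn_days):
        return WARNING, f"WARNING: host certificate expires in {days} days"
    return OK, f"OK: host certificate valid for {days} more days"
```

In a real plugin the expiry date would be read from the host certificate itself (for example from the output of `openssl x509 -noout -enddate -in <certificate file>`), the message printed on one line, and the exit code passed to `sys.exit` so that Nagios can pick up the state.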
Blocking Issues
- none
Planned, Scheduled and Cancelled Interventions
Entries in/planned to go to GOCDB
Advanced Planning
Tasks
- CASTOR 2.1.14 + SL5/6 testing
- iptables to be installed on lcgcviewer01 to harden the logging system against the injection of junk data by security scans.
- Quattor cleanup process is ongoing.
- Installation of new Preprod headnodes
Interventions
- none
Staffing
- Castor on Call person: Matt
- Staff absence/out of the office: None