RAL Tier1 weekly operations castor 17/12/2012
From GridPP Wiki
Operations News
- ATLAS federated xroot was tested successfully using a second xroot redirector on a virtual machine. A change has been submitted to move this new functionality into production in the new year.
- The memory leak on the ATLAS/Gen stagers went away shortly after Oracle stats collection was stopped. Investigation of the possible link will continue in the new year.
- CIP upgraded to fix UNDEFINED path bug affecting LHCb
Operations Problems
- (Wed) Stuck subrequests in the transfer manager brought the ATLAS instance to a standstill for about 2 hours. The root cause is not understood; it may have been due to draining. Deleting the latest subrequests fixed the problem: https://wiki.e-science.cclrc.ac.uk/web1/bin/view/Castor/CastorProcedures#Stuck_Subrequests_in_Transfer_Ma
Blocking Issues
none
Planned, Scheduled and Cancelled Interventions
Entries in/planned to go to GOCDB: none
Advanced Planning
Tasks
- Simplify and document Quattor templates to make them easier to maintain
- Test and certify 2.1.13-5 with simplified Quattor templates
Interventions
- Upgrade stagers from 2.1.12 to 2.1.13 and central services (NS, CUPV, VDQM) from 2.1.11 to 2.1.13
Staffing
- Castor on Call person: Matthew
- Staff absence/out of the office:
  - (Wed) Rob A/L
  - (Fri) MV TOIL+A/L