RAL Tier1 weekly operations castor 07/04/2014
From GridPP Wiki
Revision as of 13:17, 4 April 2014 by Matthew Viljoen
Operations News
- Facilities CASTOR was successfully upgraded to 2.1.14-11
Operations Problems
- CMS load continues to cause problems; we had to restart the transfer/disk managers to get things working again (Monday 10:45 and Tuesday 17:30)
- transfermanagerd restarted on fdscdlf02 Thursday
- The vcert SRM and name server are not accessible due to hypervisor issues after the rack move; some configuration may be required to bring them back. Dimitrios is looking into this.
- A node crash on Neptune caused brief issues with the ATLAS SRM; this is a known issue that has already been logged with Oracle.
Blocking Issues
- none
Planned, Scheduled and Cancelled Interventions
Entries in/planned to go to GOCDB: none
Advanced Planning
Tasks
- ATLAS would like to store c. 2 million EVNT Monte Carlo files; Brian to discuss with Alastair. Other Tier 1s are not keen, but RAL Tier 1 / CASTOR should be able to cope with this.
Interventions
Staffing
- CASTOR on-call person
  - (Mon-Wed) Matthew
  - (Thu-Fri) Rob?
- Staff absence/out of the office:
  - (Mon-Fri) Chris A/L
  - (Mon-Wed) Matt in DL then First Aid training
  - (Thu-Fri) Matt A/L