Difference between revisions of "RAL Tier1 weekly operations castor 05/05/2014"
From GridPP Wiki
Latest revision as of 15:56, 2 May 2014
Operations News
- 3 new V'13 disk servers were deployed into cmsDisk.
Operations Problems
- cmsDisk was very full: all but the three recently added CV'13 disks were full, which resulted in timeouts and a string of callouts. CMS have since deleted many files, which has improved matters.
- One of the 3 new V'13 disk servers installed in cmsDisk on 1st May has failed (others of this revision have also failed before going into production). The issue is currently being investigated by Fabric, and the remaining 2 servers are to stay in cmsDisk for now.
- A few SUM test failures for Atlas over the weekend of 26/27th April. The cause is not obvious and the issue has not recurred.
Blocking Issues
- none
Planned, Scheduled and Cancelled Interventions
- CASTOR 2.1.14 upgrade for Tier 1. Possible date for first stage of intervention (NS upgrade) is May 27th.
- Deployment of 2013 generation disk servers.
Advanced Planning
Tasks
- CASTOR 2.1.14 for Tier 1
Interventions
Staffing
- Castor on Call person
  - Matt until Tuesday / Rob thereafter
- Staff absence/out of the office:
  - Chris out Tues/Wed