Difference between revisions of "RAL Tier1 weekly operations castor 14/04/2014"
From GridPP Wiki
(Created page with "== Operations News == * Facilities CASTOR was successfully upgraded to 2.1.14-11 * 2.1.14 upgrade has been repeated on Preprod - this time with the NS Compatibility flag enabl...") |
(One intermediate revision by one user not shown) | |||
Line 1: | Line 1: | ||
== Operations News == | == Operations News == | ||
− | * Facilities CASTOR | + | * The NN_FILE_STAGERTIME constraint has been removed for the Facilities CASTOR database, completing the 2.1.14 upgrade. This upgrade was thought to be transparent, but some daemons didn't reconnect, TM and VMGR in particular. This was fixed by restarting services. |
* 2.1.14 upgrade has been repeated on Preprod - this time with the NS Compatibility flag enabled - as it will be in Tier 1 when we do staggered upgrades across the instances after the initial NS upgrade | * 2.1.14 upgrade has been repeated on Preprod - this time with the NS Compatibility flag enabled - as it will be in Tier 1 when we do staggered upgrades across the instances after the initial NS upgrade | ||
+ | * The xrootd timeout in castor.conf is now set to 30s for all nodes. | ||
== Operations Problems == | == Operations Problems == | ||
* A 2.1.14 bug was uncovered by Facilities where the DiskManager timeout (set to 2min) prevented recalled files being returned to users. We've disabled this timeout. | * A 2.1.14 bug was uncovered by Facilities where the DiskManager timeout (set to 2min) prevented recalled files being returned to users. We've disabled this timeout. | ||
+ | * gdss673 failed after draining and has been removed from CASTOR for Fabric intervention. | ||
+ | * An ATLAS user caused a callout by specifying an incorrect space token on write. | ||
== Blocking Issues == | == Blocking Issues == | ||
Line 17: | Line 20: | ||
* ATLAS would like to store c. 2 million EVNT Monte Carlo files – Brian to discuss with Alastair. Other Tier 1s are not keen but RAL Tier 1 / CASTOR should be able to cope with this. | * ATLAS would like to store c. 2 million EVNT Monte Carlo files – Brian to discuss with Alastair. Other Tier 1s are not keen but RAL Tier 1 / CASTOR should be able to cope with this. | ||
+ | * CASTOR 2.1.14 for Tier 1 | ||
'''Interventions''' | '''Interventions''' | ||
Line 22: | Line 26: | ||
== Staffing == | == Staffing == | ||
* Castor on Call person | * Castor on Call person | ||
− | ** | + | ** Rob |
* Staff absence/out of the office: | * Staff absence/out of the office: | ||
− | ** (Mon | + | ** (Mon) Chris A/L |
− | ** (Mon- | + | ** (Mon-Tues) Matt A/L |
− | ** ( | + | ** (Mon-Thu) Shaun A/L |
Latest revision as of 10:09, 15 April 2014
Contents
Operations News
- The NN_FILE_STAGERTIME constraint has been removed for the Facilities CASTOR database, completing the 2.1.14 upgrade. This upgrade was thought to be transparent, but some daemons didn't reconnect, TM and VMGR in particular. This was fixed by restarting services.
- 2.1.14 upgrade has been repeated on Preprod - this time with the NS Compatibility flag enabled - as it will be in Tier 1 when we do staggered upgrades across the instances after the initial NS upgrade
- The xrootd timeout in castor.conf is now set to 30s for all nodes.
Operations Problems
- A 2.1.14 bug was uncovered by Facilities where the DiskManager timeout (set to 2min) prevented recalled files being returned to users. We've disabled this timeout.
- gdss673 failed after draining and has been removed from CASTOR for Fabric intervention.
- An ATLAS user caused a callout by specifying an incorrect space token on write.
Blocking Issues
- none
Planned, Scheduled and Cancelled Interventions
Entries in/planned to go to GOCDB
- none
Advanced Planning
Tasks
- ATLAS would like to store c. 2 million EVNT Monte Carlo files – Brian to discuss with Alastair. Other Tier 1s are not keen but RAL Tier 1 / CASTOR should be able to cope with this.
- CASTOR 2.1.14 for Tier 1
Interventions
Staffing
- Castor on Call person
  - Rob
- Staff absence/out of the office:
  - (Mon) Chris A/L
  - (Mon-Tues) Matt A/L
  - (Mon-Thu) Shaun A/L