RAL Tier1 weekly operations castor 12/08/2013
From GridPP Wiki
Latest revision as of 10:14, 9 August 2013
Operations News
- Test for the distributed Oracle transaction bug is now implemented and running across all production stagers
- Repack successfully upgraded to 2.1.13-9
- Draining of ATLAS servers is ongoing and so far seems problem free
- Work is still ongoing to get HBase logging working
- Currently moving data at 2 GB/s between cmsTape and cmsDisk
Operations Problems
- Increasing number of pending jobs within CMS
- ATLAS deletion problems seem to originate outside RAL: the firewall sent a FIN packet, but it was clearly not received at the destination
- Two disk servers out of production:
  - gdss664 - Down one drive, waiting for replacement
  - gdss720 - Rob to chase status when Kashif is back
Blocking Issues
- none
Planned, Scheduled and Cancelled Interventions
Entries in/planned to go to GOCDB
- none
- ATLAS renaming to the Rucio namespace to start next week (Wednesday)
- Continue draining ATLAS disk servers - plan for 5 next week
- Start draining and decommissioning of cmsTape disk servers
- Deploy 5 new disk servers into lhcbUser
- Storage array behind the CASTOR standby database needs a firmware upgrade
Advanced Planning
Tasks
- CASTOR 2.1.14 + SL6 testing
Interventions
- none
Staffing
- CASTOR on-call person:
  - Rob
- Staff absence/out of the office:
  - Matthew A/L
  - Shaun (Monday/Tuesday)