RAL Tier1 weekly operations Fabric 20091214
From GridPP Wiki
Summary of week gone
Developments
- All:
- Martin:
- Minor procurements
- Preparation for Tier1 Review
- Ian:
- Castor LSF triplet
- James T:
- Jonathan:
- James A:
- Kash:
- Drive replacement.
- Fixing broken WNs.
- Decommissioning old batch systems (R 27).
- gdss138 fixed and given back to castor.
- gdss89, 107 and 167 given back to castor.
- gdss77 memory replaced (borrowed from gdss86) with James T and James A, and given back to castor.
- gdss367 RAID card battery replaced and given back to castor.
- gdss339 4x2GB memory replaced; fixed and given back to castor.
- gdss196 8x1GB memory replaced; fixed and given back to castor (ready for deployment).
- Working on 2008 disk servers and worker nodes.
- Working on gdss105, 171 and 282.
Absences
Operational Issues and Incidents
Index | Description | Start | End | Severity | Affected VO(s) |
---|---|---|---|---|---|
EMC arrays serving 3D/LFC/FTS databases made unstable by attempts to stabilise the Castor EMC arrays | Tuesday 6/Oct am | UPS issues to be fixed | Catastrophic | All |
Summary of plans for week ahead
Scheduled and Cancelled Down Times
Type=Down/At Risk/Cancelled entries in/planned to go to GOCDB
Component | Description | Start | End | Affected VO(s) | Type |
---|---|---|---|---|---|
Development priorities
- All
- Monday: Tier1 Review as required
- Martin:
- Minor procurements
- Ian:
- James T:
- Jonathan:
- James A:
- Kash:
- Drive replacement.
- Fixing broken WNs.
- Continuing to decommission old batch systems (R 27).
- Continuing work on 2008 disk servers and worker nodes.
- Continuing work on gdss105, 171 and 282.
Absences
- Ian, Jonathan: S/L Monday
- Kashif: A/L Thursday
Fabric On-Call
Advanced Warning of Requirements and Blocking issues
- Unable to proceed with the Atlas TAG migration to 64-bit because the arrays are being used for 3D systems while the EMC kit is flaky.
Services Issues
- Various requests for hardware.
- Working on various hardware requests for Services team.