RAL T1 weekly ops Fabric 20110502
From GridPP Wiki
Revision as of 14:38, 9 May 2011 by Kashif Hafeez
Developments
- All:
- Martin:
- Ian:
- Tim:
- James A:
- Cheney:
- DMF DR - successful roundtripping of data
- Set up vmbs
- Fixed some tape server problems
- Fixed some backup problems
- Solaris Amanda testing
- Kash:
- Drive replacement.
- Fixing broken WNs.
- Decommissioning old batch systems (R 27).
- quattor02: no hardware fault found by Dell (updated iDRAC6 firmware and RAID card driver).
- Firmware update on all Viglen 2007 disk servers (ongoing).
- Firmware update on Jetstor systems (ongoing); three updated so far.
- gdss502: replaced RAID card with help from James.
- Using Adaptec Storage Manager to monitor storage servers (SL09, V09 and SL10).
- SL08 testing more drive failures.
- APR with MJB.
- gdss293: fsprobe errors (draining).
- ADS3 array: multiple drive failures (ports 12 and 14).
Operational Issues and Incidents
| Index | Description | Start | End | Severity | Affected VO(s) |
|---|---|---|---|---|---|
Summary of plans for week ahead
Scheduled and Cancelled Down Times
Type=Down/At Risk/Cancelled entries in/planned to go to GOCDB
| Component | Description | Start | End | Affected VO(s) | Type |
|---|---|---|---|---|---|
Development priorities
- All:
- Martin:
- Ian:
- Tim:
- Cheney:
- James A:
- Kash:
- Drive replacement.
- Fixing broken WNs.
- Continue hardware failure metrics.
- Continue SL08 testing.
- Continue decommissioning old batch systems (R 27).
- Continue labelling racks and systems in UPS and HPD room.
- Book a review meeting with Andrew and James on Fabric metrics for other hardware failures.
Absences
Fabric On-Call
- Monday - Sunday : Kashif