RAL Tier1 weekly operations Fabric 20110328
Developments
- All:
- Martin:
- Ian:
- Planning rollout of latest CVMFS client
- Rolled out CVMFS upgrade on 1 cluster to test
- Set up FreeNAS and configured iSCSI; tested live migration and failover, both working with Hyper-V
- Tim:
- James A:
- Receiving handover from James T
- Tour for SuperB representatives.
- Preparing for worker node deployment.
- James T:
- Handover
- Documentation
- Tours
- Cheney
- DMF DR
- research alternatives to DMF
- created DMF DR documentation, even though I can't get it to work...
- fix database backups
- fix zora access
- fix dmf half-dead disk
- fix hinode webserver down
- fix hinode webstats out of date
- set up nfs for greg matthews
- set up another virtual machine for tessella testing
- Kash:
- Drive replacement.
- Fixing broken WNs.
- Decommissioning old batch systems (R 27).
- Test room review. (Every Monday morning)
- gdss496: created RAID1 arrays.
- Configure StroMan on SL09 disk servers.
- lcg0851-852 sent to Clustervision for fix.
- Updating firmware on Jetstor systems (ongoing); updated on three.
- logger01: re-created RAID10 array after replacing 3 drives.
- gdss150 and gdss460 given back to Castor team.
- Disk handover with James T and A.
- SL08 testing stopped due to IP change.
- Labelling racks and systems in UPS and HPD room.
Operational Issues and Incidents
| Index | Description | Start | End | Severity | Affected VO(s) |
|---|---|---|---|---|---|
Summary of plans for week ahead
Scheduled and Cancelled Down Times
Entries of type Down, At Risk or Cancelled that are in, or planned to go into, GOCDB
| Component | Description | Start | End | Affected VO(s) | Type |
|---|---|---|---|---|---|
Development priorities
- All
- Martin:
- Ian:
- GridPP 26
- Storage workshop
- Prep for ATLAS software week
- Further work on services virtualisation
- Tim:
- Cheney
- DMF DR
- job plan tasklets
- James T:
- Help with CASTOR 2.1.10-0 upgrade
- Handover
- Covering for Kash in his absence on Wednesday/Thursday
- Any last little bits of documentation
- James A:
- Beginning roll-out of new worker nodes into production.
- GridPP 26 in Sussex (Tuesday to Thursday).
- Kash:
- Drive replacement.
- Fixing broken WNs.
- Hardware failure metrics continue.
- Continue SL08 testing.
- Continue decommissioning old batch systems (R 27).
- Continue labelling racks and systems in UPS and HPD room.
Absences
- Ian at GridPP 26 & Storage Workshop Tuesday-Thursday
- James A at GridPP 26 & Storage Workshop Tuesday-Thursday
- Kash on annual leave Wednesday-Thursday
Fabric On-Call
- Ian Fabric on-call Monday - Sunday