RAL Tier1 weekly operations Fabric 20100329
Developments
- All:
- Martin:
- Preparation for Open Day talk
- Work on HEPiX Virtualisation Working Group distribution method proposal
- Work on Castor database futures
- Laptop transfer
- Change controls notifications
- Ian:
- Helped James with SL5 disk server dependencies
- Contributed to Quattor QuestNet networking bid
- Documentation of Tier1 Quattor instance
- Tim:
- Cheney:
- cleaning machine room
- investigate sls timeouts
- build new robot controller
- fix ZFS on new robot controller
- investigate Oracle install problems
- check over castor151 backups
- relocate fibre channel switches
- replace failed drive in VTL
- fix backup problems on nagger
- bring up tape servers after mir problems
- James T:
- Testing Viglen 09 Kit
- SL5 64-bit + XFS quattor disk server build
- Tier1 tour prep
- Jonathan:
- continued work on disposals
- fixed atlasbackup problems on some nodes
- updated root SSH authorized keys across farm
- ran scans of log files after security alert
- Nagios configuration updates
- continued work on Quattor-managed Nagios slave server
- James A:
- SL54 Upgrade progressing.
- Tier1 Tour Preparations.
- Batch system training session.
- Continued testing of Viglen Worker Nodes.
- Kash:
- Drive replacement.
- Fixing broken WNs.
- Decommissioning old batch systems (R27).
- Moved and packed rack sliders from R27 to R89 for return (wrong sliders).
- MAC addresses of the 13 new Dell systems (for MJB).
- Streamline engineers replaced a few drives and took 2 disk servers (gdss483 and 494).
- Castor servers (cdbc13) still working. (Intervention)
- install01 (intervention)
Absences
- Jonathan on partial retirement (not in on Monday and Friday)
Operational Issues and Incidents
Index | Description | Start | End | Severity | Affected VO(s) |
---|---|---|---|---|---|
Summary of plans for week ahead
Scheduled and Cancelled Down Times
Type=Down/At Risk/Cancelled entries in/planned to go to GOCDB
Component | Description | Start | End | Affected VO(s) | Type |
---|---|---|---|---|---|
Development priorities
- All:
- Martin:
- Open Day + talk
- Prep HEPiX site report
- Work on HEPiX VWG proposal
- Ian:
- Further work on Castor servers in Quattor
- Help ChrisK apply new LSF licenses
- Work on Virtualisation Platform
- Tim:
- Cheney:
- Build new robot controller
- James T:
- Tier1 tours
- SL5, 64-bit, XFS disk server build
- Jonathan:
- Open Day and OPB tours
- continue reconfiguration of nagios06
- continue work on disposal of old kit from A1 Upper machine room
- James A:
- Tier1 OPBs and open day.
- Continuation of SL54 upgrade.
- Continued testing of Viglen Worker Nodes.
- Kash:
- Drive replacement.
- Fixing broken WNs.
- Continued decommissioning of old batch systems (R27).
Absences
- Jonathan on partial retirement (not in on Monday and Friday)
- James T on annual leave Wednesday and Thursday.
Fabric On-Call
Ian Fabric on call Monday-Wednesday
James A Fabric on call Thursday-Monday