RAL Tier1 weekly operations Grid 20090720
Summary of Previous Week
Developments
- Derek
  - Quattorised Maui configuration
  - FTS channel changes
  - Moved Support alias to point at Support queue
  - Updated blogs' software
  - Implemented OPN ticket merging in Notifications queue
Operational Issues and Incidents
Description | Start | End | Affected VO(s) | Severity |
---|---|---|---|---|
lfc0448 - SMART errors detected | 2009-06-15 | Ongoing | All | Low |
lcgpx0619 - RAID failure | 2009-07-03 | Ongoing | All | Low |
helpdesk DB tables not backed up | 2009-07-01 | Ongoing | None | Medium |
lcgmon01 - SMART errors detected | 2009-06-15 | Ongoing | None | Medium |
Plans for Week(s) Ahead
Development Priorities
- Derek
  - Continue quattorising torque server
  - Update worker node software
- Matt
  - Catch up
  - Finish WLCG accounting
  - Move MyProxy to backup host (Kash to fix disk and set up hot-swappable disks on both hosts)
  - PPS post shortlisting
  - Check quattor-generated Maui configuration
Resource Requests
Downtimes
Description | Start | End | Affected VO(s) |
---|---|---|---|
LFC ATLAS separation | 2009-07- | 2009-07- | All |
Requirements and Blocking Issues
Description | Required By | Priority | Status |
---|---|---|---|
SL5 Worker Node Kickstart | | High | Post-kickstart configuration needed; not yet suitable for bulk deployment |
LB01 RAID failure | | Medium | Testing hotswap configuration |
lfc0448 disk failures | | Medium | Disk replacement needed |
Non-capacity HW for testing | | Medium | Still using the old HW |
Hardware for PPS | | Medium | May need to deploy imminently |
OnCall/AoD Cover
- Primary OnCall
- Grid OnCall
  - Derek (Mon-Wed); Matt (Thu-Sun)
- AoD