RAL Tier1 weekly operations Grid 20090720

From GridPP Wiki

Summary of Previous Week

Developments

  • Derek
    • Quattorised Maui configuration
    • FTS Channel changes
    • Moved Support alias to point at Support queue
    • Updated blogs' software
    • Implemented OPN ticket merging in Notifications queue

Operational Issues and Incidents

Description                       Start       End      Affected VO(s)  Severity
lfc0448 - SMART errors detected   2009-06-15  Ongoing  All             Low
lcgpx0619 - RAID failure          2009-07-03  Ongoing  All             Low
helpdesk DB tables not backed up  2009-07-01  Ongoing  None            Medium
lcgmon01 - SMART errors detected  2009-06-15  Ongoing  None            Medium

Plans for Week(s) Ahead

Development Priorities

  • Derek
    • Continue quattorising Torque server
    • Update worker node software
  • Matt
    • Catch up
    • Finish WLCG accounting
    • Move MyProxy to backup host (Kash to fix disk, and set up hotswappable disks on both hosts)
    • PPS post shortlisting
    • Check quattor-generated Maui configuration

Resource Requests

Downtimes

Description           Start             End               Affected VO(s)
LFC ATLAS separation  2009-07-27 08:00  2009-07-27 17:00  All

Requirements and Blocking Issues

Description                  Required By  Priority  Status
SL5 Worker Node Kickstart                 High      Post-kickstart configuration needed; not yet suitable for bulk deployment
LB01 RAID failure                         Medium    Testing hotswap configuration
lfc0448 disk failures                     Medium    Disk replacement needed
Non-capacity HW for testing               Medium    Still using the old HW
Hardware for PPS                          Medium    May need to deploy imminently

OnCall/AoD Cover

  • Primary OnCall
  • Grid OnCall
    • Derek (Mon-Wed); Matt (Thu-Sun)
  • AoD