RAL T1 weekly ops Fabric 20110912

From GridPP Wiki

Developments

  • All:
  • Tim:
  • James A:
  • Cheney:


  • Kash:
    • Drive replacement.
    • Fixing broken WNs.
    • Decommissioning old disk servers/batch systems (Viglen 2006 started).
    • Logger 1: multiple drive failures.
    • lcgce09: RAID failure.
    • Create change control for Viglen 2007 AMD disk servers.
    • Quattor02: backplane and hard disk (Port 2) replaced by Dell engineer.
    • Arranged collection with DHL and SGI.
    • Acceptance tests completed on 7 Viglen 2007 AMD disk servers (passed).


  • Martin:
  • Ian:
    • Alan Kyffin induction
    • Helping Alan get started with Quattor and StratusLab
    • Reviewing disk tenders
    • Travel prep



Operational Issues and Incidents

Index | Description | Start | End | Severity | Affected VO(s)

Summary of plans for week ahead

Scheduled and Cancelled Down Times

Entries of type Down, At Risk, or Cancelled that are in, or planned to go into, GOCDB

Component | Description | Start | End | Affected VO(s) | Type

Development priorities

  • All:
  • Tim:
  • Cheney:
  • James A:
  • Kash:
    • Drive replacement.
    • Fixing broken WNs.
    • Hardware failure review and metrics continue.
    • Continue decommissioning old disk servers/batch systems (R 27).
    • Continue labelling racks and systems in UPS and HPD room.


  • Martin:
  • Ian:
    • Visiting CERN
    • Attending GridPP 27
    • Remote work on StratusLab
    • Reviewing disk tenders
    • Reviewing job applications

Absences

  • Ian at CERN Tuesday-Friday
  • James at CERN Tuesday-Friday

Fabric On-Call

  • Ian Primary OnCall Friday-Sunday

Advance Warning of Requirements and Blocking Issues

Services Issues

