RAL T1 weekly ops Fabric 20110808

From GridPP Wiki

Developments

  • All:
  • Tim:
  • James A:
  • Cheney:


  • Kash:
    • Drive replacement.
    • Fixing broken WNs.
    • Decommissioning old disk servers/batch systems.
    • Southampton University visit (server lifter).
    • gdss190: updated firmware on both RAID cards and re-created the RAID array.
    • MTI engineer visit. PSU replaced.
    • gdss487: crashed with filesystem errors (port 22 failed).
    • gdss435: memory replaced.
    • Quattor02: logs sent to Dell for further investigation.
    • gdss540: started verify; fixed (while in production).


  • Martin:
  • Ian:


Operational Issues and Incidents

Index | Description | Start | End | Severity | Affected VO(s)

Summary of plans for week ahead

Scheduled and Cancelled Down Times

Type=Down/At Risk/Cancelled entries in/planned to go to GOCDB

Component | Description | Start | End | Affected VO(s) | Type

Development priorities

  • All:
  • Tim:
  • Cheney:
  • James A:
  • Kash:
    • Drive replacement.
    • Fixing broken WNs.
    • Hardware failure review and metrics continue.
    • Continue decommissioning old disk servers/batch systems (R 27).
    • Continue labelling racks and systems in the UPS and HPD room.


  • Martin:
  • Ian:

Absences

  • Kashif A/L (Monday, Thursday and Friday)

Fabric On-Call

Advanced Warning of Requirements and Blocking issues

Services Issues

