RAL Tier1 weekly operations Fabric 20101115

Developments

  • All:
  • Martin:
  • Ian:
  • Tim:
  • Jonathan:
  • James A:
  • James T:
  • Cheney:
    • Blown drives replaced.
    • Write a script to restore from tape for the DB group.
    • Script for DB group backup checks.
    • Write Nagios checks for castor fac.
    • Fix castor303.
    • Cut over sls to the jiscmail server only.
    • Backups.
  • Kash:
    • Drive replacement.
    • Fixing broken WNs.
    • Decommissioning old batch systems (R 27).
    • gdss380 still with Streamline for a fix (crashed with a single faulty drive).
    • gdss417 acceptance testing (crashed with a single faulty drive).
    • gdss280 crashed again with the replacement RAID card borrowed from gdss338 (testing).
    • Annual Hearing review.
    • gdss117 failed during test.
    • Hardware failure stats/graphs.
    • Meeting with Gareth from Streamline about SL08 issues.
    • Streamline/Areca disk servers crashed due to a single faulty drive (ongoing).


Operational Issues and Incidents

Index Description Start End Severity Affected VO(s)

Summary of plans for week ahead

Scheduled and Cancelled Down Times

Type=Down/At Risk/Cancelled entries in/planned to go to GOCDB

Component Description Start End Affected VO(s) Type

Development priorities

  • All:
  • Martin:
  • Ian:
  • Tim:
  • Cheney:
    • Backups of various sorts.
  • Jonathan:
  • James T:
  • James A:
  • Kash:
    • Drive replacement.
    • Fixing broken WNs.
    • Continued decommissioning of old batch systems (R 27).

Absences

  • Jonathan: on partial retirement (not in on Mondays and Fridays).
  • Cheney: date off changed, now Nov 24th. Early warning: likely to be off for most of December (dates subject to change).

Fabric On-Call

  • Kashif Hafeez

Advance Warning of Requirements and Blocking Issues

Services Issues

