RAL Tier1 weekly operations Fabric 20101011

From GridPP Wiki
Revision as of 14:17, 25 October 2010 by Kashif hafeez (Talk | contribs)


Developments

  • All:
  • Martin:
  • Ian:
  • Tim:
  • Jonathan:
  • James A:
  • James T:
  • Cheney:
    • Quattorising the facilities
    • fix castor151
    • copy over database archive logs


  • Kash:
    • Drive replacement.
    • Fixing broken WNs.
    • Decommissioning old batch systems. (R 27)
    • gdss110 fsprobe errors. (Acceptance testing)
    • gdss380 failed acceptance testing with the new RAID card as well. (Crashed with a single faulty drive)
    • gdss417 started acceptance testing. (Crashed with a single faulty drive)
    • Changed network settings in BIOS of Streamline 2009 disk servers.
    • gdss280 crashed during acceptance testing. (Probably the RAID card)
    • Arranged Streamline engineer's visit for gdss490. Received and back in the rack.
    • Updated post-mortem for gdss280 & gdss417.
    • Hardware failure stats/graphs.
    • Fixed a couple of new Dell machines.
    • gdss512 received back from LSI. (USA)
    • Streamline 2009 disk server testing in absence of James T.
    • Streamline/Areca disk servers crashing due to a single faulty drive. (Ongoing)

Absences

  • Jonathan on partial retirement (not in on Mondays and Fridays)

Operational Issues and Incidents

Index | Description | Start | End | Severity | Affected VO(s)

Summary of plans for week ahead

Scheduled and Cancelled Down Times

Type=Down/At Risk/Cancelled entries in/planned to go to GOCDB

Component | Description | Start | End | Affected VO(s) | Type

Development priorities

  • All:
  • Martin:
  • Ian:
  • Tim:
  • Cheney:
    • Quattorise the facilities


  • Jonathan:
  • James T:
    • A/L until 14th October
  • James A:
  • Kash:
    • Drive replacement.
    • Fixing broken WNs.
    • Update daily status of Streamline 2009 disk server testing.
    • Continue decommissioning old batch systems. (R 27)

Absences

  • James T on A/L until 14th October
  • Jonathan on partial retirement (not in on Mondays and Fridays)

Fabric On-Call

  • Kashif Hafeez

Advanced Warning of Requirements and Blocking issues

Services Issues


RAL Tier1 weekly operations fabric

Category:RAL_Tier1