RAL Tier1 weekly operations Fabric 20110131



Developments

  • All:
  • Martin:
  • Ian:
  • Tim:
  • James A:
  • James T:
    • Strategy meeting
    • Preparation for CMS SL5 upgrade
    • Disk sweep in Kash's absence
    • Security log searches
    • Project management course
  • Cheney:
  • Kash:
    • Drive replacement.
    • Fixing broken WNs.
    • Decommissioning old batch systems (R 27).
    • gdss380: added new MAC address in DHCP and re-installed.
    • gdss189: read-only filesystem (SCSI errors).
    • gdss496: SCSI errors. Reported to Streamline with logs (intervention).
    • Tier1 Strategy meeting.
    • Fabric Hardware failure metrics.
    • More drive failures on Jetstor systems.
    • lcgbdii0652 moved into UPS room with Richard.
    • gdss337: replaced 4 x 2 GB memory. Back in production.
    • gdss98 given back to Castor team.
    • gdss280 started acceptance test. Replacement disk server for gdss283.
    • Cleared test area.
    • gdss435: replaced 4 x 2 GB memory. Back in production.
    • lcgec01: replaced drive by hot swap.
    • SL 2010 and Viglen 2010 disk servers in testing.
    • SL 2009: auto rebuild on hot spare fails. Set rebuild priority from Low to High.


Operational Issues and Incidents

Index | Description | Start | End | Severity | Affected VO(s)

Summary of plans for week ahead

Scheduled and Cancelled Down Times

Entries of type Down, At Risk or Cancelled that are in, or planned to go into, GOCDB

Component | Description | Start | End | Affected VO(s) | Type

Development priorities

  • All:
  • Martin:
  • Ian:
  • Tim:
  • Cheney:
  • James T:
    • CMS SL5 upgrade
    • SL08 investigations
    • Puppet -> Quattor work
  • James A:
  • Kash:
    • Drive replacement.
    • Fixing broken WNs.
    • SL 2009: auto rebuild on hot spare fails. Set rebuild priority from Low to High.
    • Continue hardware failure metrics.
    • SL08 testing.
    • Continue decommissioning old batch systems (R 27).

Absences


Fabric On-Call

  • Monday - Sunday

Advanced Warning of Requirements and Blocking issues

Services Issues


RAL Tier1 weekly operations fabric

Category:RAL_Tier1