RAL Tier1 weekly operations Fabric 20110214

From GridPP Wiki
Revision as of 16:37, 21 February 2011 by Kashif hafeez (Talk | contribs)

Developments

  • All:
  • Martin:
  • Ian:
  • Tim:
  • James A:
  • James T:
    • Correct installation of V10 machines with Quattor
    • WAN tuning on cmsWanIn and cmsWanOut
    • SL08 testing
    • Investigating blocking processes
  • Cheney:
  • Kash:
    • Drive replacement.
    • Fixing broken WNs.
    • Decommissioning old batch systems. (R 27)
    • gdss380: add new MAC address in DHCP and re-install.
    • Change control for Adaptec RAID cards. (SL09 & SL10)
    • gdss496: start acceptance test. (Intervention)
    • High battery temperature messages on V10 and SL10 disk servers.
    • Fabric Hardware failure metrics.
    • Update firmware on Jetstor systems.
    • gdss502: drive failures and failed stripes. (Started verify fix.)
    • gdss510 faulty motherboard. Reported to Streamline.
    • gdss66 given back to Castor team.
    • gdss280 passed acceptance test and put back into production.
    • SL 2009 Auto rebuild on hotspare fails. Set rebuild priority from Low to High.


Operational Issues and Incidents

Index | Description | Start | End | Severity | Affected VO(s)

Summary of plans for week ahead

Scheduled and Cancelled Down Times

Type=Down/At Risk/Cancelled entries in/planned to go to GOCDB

Component | Description | Start | End | Affected VO(s) | Type

Development priorities

  • All:
  • Martin:
  • Ian:
  • Tim:
  • Cheney:
  • James T:
    • Gen upgrade to SL5 64-bit
    • Apply new WAN tuning to all CMS disk servers
    • Disk servers as iSCSI targets
    • SL08 testing
    • Re-install all V10/SL10 machines
    • A/L Thursday
  • James A:
  • Kash:
    • Drive replacement.
    • Fixing broken WNs.
    • SL 2009 Auto rebuild on hotspare fails. Set rebuild priority from Low to High.
    • Hardware failure metrics continue.
    • SL08 testing.
    • Continue decommissioning old batch systems. (R 27)

Absences

  • James T A/L Thursday

Fabric On-Call

  • Monday - Sunday

Advance Warning of Requirements and Blocking issues

Services Issues
