RAL T1 weekly ops Fabric 20110516

From GridPP Wiki

Developments

  • All:
  • Tim:
  • James A:
    • Restructured disk server templates to remove duplication.
    • Created new Storage-D machine type for facilities instance.
    • Thinking about metrics and statistics.
  • Cheney:
    • Change IP addresses of facilities arrays.
    • Investigate Storage-D lockup.
    • Certificates problem.
    • Fix some backups.
    • Apply errata to some facilities kit.
    • Documentation for DMF DR.
    • Server stats.
  • Kash:
    • Drive replacement.
    • Fixing broken WNs.
    • Decommissioning old batch systems (R 27).
    • quattor02 showing same errors again.
    • Firmware update on all Viglen 2007 disk servers (ongoing).
    • Updating firmware on Jetstor systems (ongoing); three updated so far.
    • gdss502 passed acceptance test (given back to Castor team).
    • Added SL09, V09 and SL10 to Adaptec Storage Manager for monitoring.
    • SL08: testing 3 disk servers with multiple drive failures. Review with MJB.
    • gdss294: read-only file system.
    • gdss293 passed memory test.
    • Dell system from Castor Rack H reported (iDRAC failure).
    • gdss206: drives replaced and rebuild completed (back in production).
    • Fabric hardware failure analysis meeting with Andrew and James.


  • Martin:
  • Ian:


Operational Issues and Incidents

Index Description Start End Severity Affected VO(s)

Summary of plans for week ahead

Scheduled and Cancelled Down Times

Type=Down/At Risk/Cancelled entries in/planned to go to GOCDB

Component Description Start End Affected VO(s) Type

Development priorities

  • All:
  • Tim:
  • Cheney:
    • Backups for SCT.
  • James A:
    • Assisting with finalisation of Facilities instance configs.
    • Developing Job Plan.
    • Computer room tours.
  • Kash:
    • Drive replacement.
    • Fixing broken WNs.
    • Continue hardware failure metrics.
    • Continue SL08 testing.
    • Continue decommissioning old batch systems (R 27).
    • Continue labelling racks and systems in the UPS and HPD rooms.


  • Martin:
  • Ian:

Absences

Fabric On-Call

  • Monday - Sunday

Advanced Warning of Requirements and Blocking issues

Services Issues

