RAL Tier1 weekly operations Fabric 20110321


Developments

  • All:
  • Martin:
    • Procurements
  • Ian:
    • Quattor workshop at CERN
    • Informal meetings with CVMFS developers and Atlas
  • Tim:
  • James A:
    • Arranging Dell and HP engineers
    • Quattor workshop at CERN
    • Informal meetings with CVMFS developers and Atlas
  • James T:
  • Cheney:
    • DMF disaster recovery
    • Glitches in database backup copies
    • Send for disks for Jetstors
    • Help Greg Matthews with rsync
    • Set up virtual machine for Tessella testing
    • Fix crontab problems on offline tape controller
    • Some jobplan tasks
  • Kash:
    • Drive replacement.
    • Fixing broken WNs.
    • Decommissioning old batch systems (R 27).
    • Test room review. (Every Monday morning)
    • gdss496 completed badblocks test.
    • Configure StroMan on SL09 disk servers.
    • Motherboard replaced in lcg1233 (Engineer).
    • Update firmware on Jetstor systems (ongoing). Updated on three.
    • Moved lcgbdii0632 into UPS room with Richard.
    • Check new ClusterVision batch systems (testing).
    • Disk catchup with James T.
    • SL08 testing continues.
    • Labelling racks and systems in UPS and HPD room.



Operational Issues and Incidents

Index | Description | Start | End | Severity | Affected VO(s)

Summary of plans for week ahead

Scheduled and Cancelled Down Times

Entries of type Down, At Risk or Cancelled that are in, or planned to go into, GOCDB

Component | Description | Start | End | Affected VO(s) | Type

Development priorities

  • All:
  • Martin:
    • Final procurements
    • DB systems deployment & testing
  • Ian:
    • Catching up
    • Planning rollout of latest CVMFS client
    • Setting up iSCSI targets on disk server
  • Tim:
  • Cheney:
    • DMF disaster recovery testing
    • DMF DR documentation
  • James T:
  • James A:
    • Testing new worker nodes with production config
    • Improving test area connectivity
    • Preparing for GridPP next week
  • Kash:
    • Drive replacement.
    • Fixing broken WNs.
    • Hardware failure metrics continue.
    • Continue SL08 testing.
    • Continue decommissioning old batch systems (R 27).
    • Continue labelling racks and systems in UPS and HPD room.

Absences

  • Martin on training course Monday-Tuesday
  • James T on leave Monday

Fabric On-Call

  • Ian: primary on call Monday-Sunday

Advanced Warning of Requirements and Blocking issues

Services Issues

