RAL Tier1 weekly operations Fabric 20110411

From GridPP Wiki

Developments

  • All:
  • Martin:
  • Ian:
    • At CERN
    • Atlas SW Workshop talk about CVMFS
    • Presented CVMFS security review to GDB
    • Meetings with CVMFS developers
    • Worked with Steve Traylen on CVMFS service deployment and monitoring
  • Tim:
  • James A:
  • Cheney:
    • DMF DR
  • Kash:
    • Drive replacement.
    • Fixing broken WNs.
    • Decommissioning old batch systems (R 27).
    • Test room review. (monthly)
    • gdss496: smartd tool still needs to be installed.
    • quattor02: firmware updated on all drives.
    • gdss481 and gdss488 given back to Castor team. (Fixed)
    • Updating firmware on Jetstor systems (ongoing); updated on three so far.
    • gdss502 found two drives with lots of medium errors.
    • gdss426 given back to Castor team. (Fixed)
    • APR.
    • SL08 testing restarted by James T. So far 3 drive failures with no crash.
    • gdss103: 'hardware error'; out of production and service. (Ready for decommissioning)


Operational Issues and Incidents

Index | Description | Start | End | Severity | Affected VO(s)

Summary of plans for week ahead

Scheduled and Cancelled Down Times

Type=Down/At Risk/Cancelled entries in/planned to go to GOCDB

Component | Description | Start | End | Affected VO(s) | Type

Development priorities

  • All:
  • Martin:
  • Ian:
    • Prepare CVMFS for Atlas and LHCb production
    • CVMFS callout documentation
    • APR
    • Start prep for Science Oxford talk
    • Prepare errata templates for Quattor managed systems


  • Tim:
  • Cheney:
    • DMF DR
  • James A:
  • Kash:
    • Drive replacement.
    • Fixing broken WNs.
    • Continue hardware failure metrics.
    • Continue SL08 testing.
    • Continue decommissioning old batch systems (R 27).
    • Continue labelling racks and systems in UPS and HPD room.

Absences

  • Thursday & Friday - Kashif
  • Martin - Monday at least

Fabric On-Call

  • Monday: Ian
  • Tuesday - Sunday: Kashif

Advanced Warning of Requirements and Blocking issues

Services Issues

