RAL Tier1 weekly operations Fabric 20101213

From GridPP Wiki

Developments

  • All:
  • Martin:
    • Some work on CPU deliveries
    • Prep for Atlas poweroff weekend
    • Work on database migration plans
    • Babysitting the Castor database backups
  • Ian:
    • Added detail to public cvmfs page
    • Got cvmfs mirroring prototype working
    • Rebuilding Hyper-V cluster
    • Job plan reviews
  • Tim:
    • 9940 draining
    • ADS shutdown work
    • DMF futures planning
  • James A:
    • Supervising deliveries and installation of new ClusterVision and Viglen worker nodes.
    • Shutting down and moving servers for Atlas power down.
  • James T:
    • ATLAS Castor upgrade
    • ATLAS power outage work
    • Learning about AFS
    • Disk deployment
    • Disk server fixing in Kash's absence
    • Fixed LHCb/ATLAS disk server Ganglia monitoring
  • Cheney:
    • S/L
  • Kash:


Operational Issues and Incidents

Index | Description | Start | End | Severity | Affected VO(s)

Summary of plans for week ahead

Scheduled and Cancelled Down Times

Entries of type Down, At Risk or Cancelled that are in, or planned to go into, GOCDB

Component | Description | Start | End | Affected VO(s) | Type

Development priorities

  • All:
  • Martin:
    • Recovery from Atlas poweroff weekend
    • Investigate Cacti/SAR issue post-shutdown
    • Receipts for capacity procurements + invoicing
    • Babysitting the Castor database backups
  • Ian:
    • Investigate Nagios checks for cvmfs client
    • Rebuilding Hyper-V cluster
    • Add remaining repositories to cvmfs mirror and automate updates
    • Job plan re-reviews
  • Tim:
    • New Oracle help system.
    • Remove VTL from DMF
    • Get IPMI working again on DMF
  • Cheney:
    • S/L
  • James T:
    • Disks in Kash's absence
    • Intervention on BaBar disk servers
    • Viglen 2010 testing
    • Delivery of remainder of Streamline 2010 machines.
  • James A:
    • Installing servers brought over from Atlas.
    • Performing various upgrades and reboots.
  • Kash:

Absences

  • Cheney - Away for most of December.
  • Kash - On leave on Monday.

Fabric On-Call

  • James Thorne (also primary).

Advance Warning of Requirements and Blocking Issues

Services Issues

