RAL Tier1 weekly operations Grid 20110829

From GridPP Wiki

Operational Issues

Description | Start | End | Affected VO(s) | Severity | Status

Downtimes

Description | Hosts | Type | Start | End | Affected VO(s)

Blocking Issues

Description | Requested Date | Required By Date | Priority | Status

Developments/Plans

Highlights for Tier-1 Ops Meeting

Highlights for Tier-1 VO Liaison Meeting

Detailed Individual Reports

Alastair

  • Working on permission change. [Ongoing]
  • Looking at Hammer Cloud test results across UK Cloud.
  • Frontier: testing new API, monitoring packages, and helping deploy new box.

Andrew

  • Diskserver deployment for ATLAS & LHCb [Done]
  • Preparing for capacity signoff meeting [Done]
  • Added monitoring of disk/CPU in production to capacity planning system [Done]
  • Kernel / OS errata upgrades

Catalin

  • glite-LB updates [Done]
  • Work on VMs and HyperV [Ongoing]
  • LHCb VOBOX updates [Done]

VO Reports

ALICE

ATLAS

CMS

  • Software server was temporarily overloaded on 20th August, causing some JobRobot and production jobs to fail. The overload was caused by a period of higher-than-normal job start rate.

LHCb

OnCall/AoD Cover

OnCall Rota

  • Primary OnCall: Catalin (Mon - Fri)
  • Grid OnCall: Andrew (Sat - Sun)