RAL Tier1 weekly operations Overview 20090914

From GridPP Wiki

Overview of Milestones and Metrics

Key High Level dates

  • LHC schedule See: http://www.gridpp.ac.uk/gridpp23/gridpp23_LHCstatus.ppt
    • Repairs continue - very challenging program of work
    • Injection tests, late October.
    • LHC ready for beam by mid-November.
    • Commissioning needs 28 days of beam time (a factor of 2 longer in elapsed time)
    • Standby December 19th
    • Restart: recommissioning commences January 4th
    • Beam 11th January
      • Scheduled 3-day downtime (Monday-Wednesday every 4 weeks), restart on the Thursday.
    • Run until September, then 2 weeks down to prepare for heavy ions
    • Run ends in October
  • Tier-1 software freeze date end of September

Key Metrics

Owner          Description                                                   Target  Achieved
Gareth Smith   Overall Tier-1 SAM Availability (last week)                   97%     99%
Gareth Smith   ALICE SAM Availability (Aug)                                  97%     77%
Gareth Smith   ATLAS SAM Availability (Aug)                                  97%     75%
Gareth Smith   CMS SAM Availability (Aug)                                    97%     77%
Gareth Smith   LHCb SAM Availability (Aug)                                   97%     78%
Andrew Sansum  Fraction of Tier-1 Staff in Post (Aug)                        93%     103%
Gareth Smith   Number of days where called out (last spreadsheet full week)  3       3
Matt Hodges    Percentage met of UB allocation of disk (Aug)                 100%
Matt Hodges    Job Efficiency (Aug)                                          85%     67%
Matt Hodges    Farm Occupancy (Aug)                                          85%     41%
Matt Viljoen   Number of >Severe CASTOR Incidents (Aug)                      6       1

Key Production Milestones

See myactions:

https://myactions.gridpp.rl.ac.uk/all/where/category_name/Operational/

High Level Schedule

Final Update Window            Mon 13/07/09 to 30/09/09
Tier-1 Stability Period (2)    October to mid-November
LHC First beam                 mid-November

Disaster Management

  • Swine Flu (H1N1) downgraded to level 1. No regular meetings; will reactivate when case frequency increases.
  • Disk deployment (level 2): testing with Viglen is ongoing. Increasingly likely that we will escalate to level 3 if there is no progress soon.
  • Machine room air-conditioning. Formally level 3, but progress is being made in understanding what happened. Likely to downgrade to level 2.
  • Water leak


Swine Flu Response Plan

See: https://wiki.e-science.cclrc.ac.uk/web1/bin/view/EScienceInternal/TierOneSwineFlu

Purchasing and Finance

  • GridPP finalising high-level spend plan.
  • Disk tender at ITT stage.
  • CPU PQQ closes this week.
  • Tape drive purchases need to start soon.
  • Beginning to build spend plan. Propose we cost a pre-production/virtualisation testbed this week.

Staffing

  • Second experiment support post, start date 5th October.

PMB Experiment Reports

ATLAS

CMS

LHCb

Hardware Deployment Report

Team Reports

Fabric

RAL Tier1 weekly operations Fabric 20090914

Grid Services

http://www.gridpp.ac.uk/wiki/RAL_Tier1_weekly_operations_Grid_20090914

CASTOR

http://www.gridpp.ac.uk/wiki/RAL_Tier1_weekly_operations_castor_14/09/2009

Database

http://www.gridpp.ac.uk/wiki/Operations_Report_14/09/2009

Production

Production Team Report 2009-09-14