RAL Tier1 weekly operations Overview 20090928

From GridPP Wiki

Overview of Milestones and Metrics

Key High Level dates

    • Injection tests, late October.
    • LHC ready for beam by mid-November.
    • Commissioning needs 28 days of beam time (a factor of 2 longer in elapsed time)
    • Standby December 19th
    • Restart: recommissioning commences January 4th
    • Beam 11th January
      • Scheduled 3-day downtime (Monday to Wednesday) every 4 weeks, restarting on the Thursday.
    • Run until September, then 2 weeks down to prepare for heavy ions
    • Run ends in October
  • Tier-1 software freeze date: end of September

Key Metrics

Owner Description Target Achieved
Gareth Smith Overall Tier-1 SAM Availability (last week) 97% 100%
Gareth Smith ALICE SAM Availability (Aug) 97% 77%
Gareth Smith ATLAS SAM Availability (Aug) 97% 75%
Gareth Smith CMS SAM Availability (Aug) 97% 77%
Gareth Smith LHCb SAM Availability (Aug) 97% 78%
Andrew Sansum Fraction of Tier-1 Staff in Post (Aug) 93% 103%
Gareth Smith Number of days where called out (last spreadsheet full week) 3 4
Matt Hodges Percentage met of UB allocation of disk (Aug) 100%
Matt Hodges Job Efficiency (Aug) 85% 67%
Matt Hodges Farm Occupancy (Aug) 85% 41%
Matt Viljoen Number of >Severe CASTOR Incidents (Aug) 6
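
As a rough illustration of how the percentage metrics above compare with their targets, the following Python sketch lists the shortfall for each one. It assumes that a higher value is better for every metric included; the call-out row (where lower is better) and the rows that show only a single figure are left out. The function name `shortfalls` is purely illustrative.

```python
# Percentage metrics copied from the Key Metrics table:
# (description, target %, achieved %).
# Assumption: for each of these, higher is better.
METRICS = [
    ("Overall Tier-1 SAM Availability (last week)", 97, 100),
    ("ALICE SAM Availability (Aug)", 97, 77),
    ("ATLAS SAM Availability (Aug)", 97, 75),
    ("CMS SAM Availability (Aug)", 97, 77),
    ("LHCb SAM Availability (Aug)", 97, 78),
    ("Fraction of Tier-1 Staff in Post (Aug)", 93, 103),
    ("Job Efficiency (Aug)", 85, 67),
    ("Farm Occupancy (Aug)", 85, 41),
]


def shortfalls(metrics):
    """Return (description, gap in percentage points) for metrics below target."""
    return [
        (name, target - achieved)
        for name, target, achieved in metrics
        if achieved < target
    ]


if __name__ == "__main__":
    for name, gap in shortfalls(METRICS):
        print(f"{name}: {gap} points below target")
```

Only rows with both a target and an achieved value are listed, so the sketch does not guess which column a lone figure belongs to.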

Key Production Milestones

See myactions:

https://myactions.gridpp.rl.ac.uk/all/where/category_name/Operational/

High Level Schedule

Final Update Window              Mon 13/07/09 to 30/09/09
Tier-1 Stability Period (2)      October to mid-November
LHC First beam                   mid-November

Disaster Management

  • Swine Flu (H1N1) downgraded to level 1. No regular meetings; will re-activate when case frequency increases.
  • Disk deployment (level 2): testing with Viglen is ongoing. It is increasingly likely that we will escalate to level 3 if there is no progress soon.
  • Machine room air-conditioning. Now level 2.
  • Water leak


Swine Flu Response Plan

See: https://wiki.e-science.cclrc.ac.uk/web1/bin/view/EScienceInternal/TierOneSwineFlu

Purchasing and Finance

  • GridPP finalising high level spend plan.
  • Disk tender at ITT evaluation stage.
  • CPU PQQ at evaluation stage.
  • Tape drive purchases being planned.
  • Beginning to build spend plan.

Staffing

  • Second experiment support post, start date 5th October.

PMB Experiment Reports

ATLAS

CMS

Tier-1 OK

LHCb

Hardware Deployment Report

1. Disk servers deployed last week:
  • 14 for atlasSimStrip
  • 5 for atlasHotDisk

2. The disk server deployment procedure has been finalized and tested by Tiju. We are now ready for real disk server deployment using kickstart and Puppet.

3. An SL5 64-bit kickstart has been requested from the Fabric Team. It needs to be tested and will probably be used to deploy new disk servers.

4. Deployment Rota (28/09 - 02/10):
  • FabMon: Martin
  • DeputyFabMon: James T.
  • DepMon: Chris
  • DeputyDepMon: Shaun


Team Reports

Fabric

RAL Tier1 weekly operations Fabric 20090928

Grid Services

http://www.gridpp.ac.uk/wiki/RAL_Tier1_weekly_operations_Grid_20090928

CASTOR

http://www.gridpp.ac.uk/wiki/RAL_Tier1_weekly_operations_castor_28/09/2009

Database

http://www.gridpp.ac.uk/wiki/Operations_Report_28/09/2009

Production

Production Team Report 2009-09-28