RAL Tier1 weekly operations Overview 20100315

From GridPP Wiki
Revision as of 16:53, 17 March 2010 by Alastair dewhurst (Talk | contribs)

Production Managers Report

Planned interventions and other operational issues Production Team Report 2010-03-15

LHC Schedule and Experiment Issues

https://wiki.e-science.cclrc.ac.uk/web1/bin/view/EScienceInternal/LhcSchedule

(Link updated 17/3/10. Old link was: https://wiki.e-science.cclrc.ac.uk/web1/bin/view/Sandbox/LhcSchedule)

Changes

Approved Changes

  • Enable glexec on Worker Nodes (Team Leader)
  • Update FTS to version 2.2.3 (Change Team)

Other considered changes

  • Migrate Nagios slave servers (awaiting feedback from Jonathan)
  • T10KA/B microcode changes (awaiting feedback from Tim)

Pending scheduling

  • Application of ORACLE January PSU
  • Move CMS to T10KB drives
  • Upgrade SL5 worker nodes to SL 5.4
  • Switch to Quattor as the deployment mechanism for new disk servers
  • Cleaning of non-LHC LFC schema

Reviewed Changes

  • ATLAS CPU/wall limits (Good, Successful)
  • New Service Class for CMS (Good, Problematic)
  • DBSERVMON Installation (Good, Successful)
  • CASTOR Database - Clusterware Reconfiguration (Good, Successful)
  • Reduce ATLAS LSF Cleaning period to 14400 seconds (Good, Successful)
  • Change control for ORACLE security risk (Good, Successful)

Team Reports

Fabric

RAL Tier1 weekly operations Fabric 20100315

Grid Services

http://www.gridpp.ac.uk/wiki/RAL_Tier1_weekly_operations_Grid_20100315

CASTOR

http://www.gridpp.ac.uk/wiki/RAL_Tier1_weekly_operations_castor_15/03/2010

Database

http://www.gridpp.ac.uk/wiki/Operations_Report_15/03/2010