Production Team Report 2010-01-25

RAL Tier1 Production Team Report for 25th January 2010.

Again, thanks to everyone for keeping the systems going despite the snowy weather.

AoD This Week

Mon-Tues: John; Wed: Tiju; Thu: Gareth; Thu-Fri: John

Last Week (21-24 December)

  • Gareth: AoD (1 day), Scheduling intervention, metrics, planning awayday, contributing to GridPP4 bid.
  • John: AoD (3 days), testing updated fetch-crl script, helped Fabric with disk servers, found a solution to chkrootkit issues, looked at Nagios (added disk servers) and the dashboard.
  • Tiju: A/L.

Changes to Operating procedures

  • None

Declared Outages in GOC DB

  • Big Intervention on 27/28th (Wed & Thursday).
    • Stop batch from 20:00 on 24th Jan (Sunday) to 28th 17:00.
    • FTS outage 27th (07:00 - 19:00)
    • LFC Outage 27th (08:00 - 19:00)
    • Castor outage 27th 08:00 - 28th 17:00.
    • Grid services nodes At Risk (kernel updates) Wednesday 27th.
    • LFCs At Risk (front end kernel updates) Thursday 28th. 12:00-12:00
  • 3D Migration Monday 1st February.

More details of the proposed timetable for the changes within those time windows are on the internal Wiki at:

 https://wiki.e-science.cclrc.ac.uk/web1/bin/view/EScienceInternal/January2010Plans

The following are expected to be added to the GOC DB:

Tuesday 9th Feb: At Risk / outage for network intervention.

Not yet scheduled in:

  • At Risk for Castor Atlas & LHCb for replacing RAC node (replace cdbc08 with cdbe07).
  • CIP updates.