Production Team Report 2010-01-18
From GridPP Wiki
Revision as of 14:19, 18 January 2010 by Gareth smith (Talk | contribs)
RAL Tier1 Production Team Report for 18th January 2010.
Again, thanks to everyone for keeping the systems going despite the snowy weather.
AoD This Week
Mon-Tues: John; Wed: Catalin; Thu: Gareth; Thu-Fri: John
Last Week (21-24 December)
- Gareth: AoD (a bit less than 1 day). Scheduling and rescheduling changes for January.
- John: AoD (3+ days). Checksumming files (for LHCb FSPROBE errors). Script to enable Dashboard to query Overwatch (to report servers in intervention).
- Tiju: A/L.
Changes to Operating procedures
- None
Declared Outages in GOC DB
- At Risk on Castor for RAC nodes memory upgrade (18th Jan 09:00 - 22nd Jan 16:00)
- SRM updates: Tuesday (GEN, CMS, LHCb), Wednesday (Atlas) (At Risk).
- Big Intervention on 27/28th (Wed & Thursday).
- Stop batch from 20:00 on 24th Jan (Sunday) to 28th 17:00.
- FTS outage 27th (07:00 - 19:00)
- LFC Outage 27th (08:00 - 19:00)
- Castor outage 27th 08:00 - 28th 17:00.
More details of the proposed timetable for the changes within those time windows are on the internal Wiki at:
https://wiki.e-science.cclrc.ac.uk/web1/bin/view/EScienceInternal/January2010Plans
The following are expected to be added to the GOC DB:
- Tuesday 26th: Migration of 3D databases back to EMC disk arrays. Essentially work for the database team, but if there are problems, could it interfere with the strategy meeting?
- CIP update Thursday 14:30 - 15:30.
- Grid Services nodes (kernel updates) "At Risk" Wednesday 27th.
- But LFC front ends "At Risk" Thursday 28th 12:00-14:00
Not yet scheduled in:
- At Risk for Castor Atlas & LHCb for replacing RAC node (replace cdbc08 with cdbe07).