Production Team Report 2010-07-12
From GridPP Wiki
RAL Tier1 Production Team Report for 14th July 2010.
AoD This Week
Mon: John, Tue: Gareth, Wed: John, Thu: Tiju, Fri: John
Last two weeks
- Gareth: AoD (3 days), WLCG workshop (preparation of talk), Post Mortem of CMS data loss, safe manual handling course.
- John: AoD (5 days), WLCG workshop, looking at Nagios checks for Castor 2.1.9.
- Tiju: A/L (last two weeks)
Changes to Operating procedures
- Note: Further changes to the disk server intervention procedure: an extra check between Fabric & Castor representatives before any action that would delete data.
Declared Outages in GOC DB
- July 19-22 NEW LHC Technical Stop Dates. Transformer checks. (Site At Risk). TX1 & TX4. At Risk on whole Tier1 from 08:30 on Monday 19th to 17:00 on Thursday 22nd July. (May not take place as planned!)
Advanced Warning
- Today (12th July) - WAN tuning on LHCb disk servers.
- Thursday 15th July (tbc): Multipath configuration update for OGMA (10 - 2)
- Monday 19th July (tbc): Multipath configuration update for LUGH (10 - 2)
- Tuesday 20th July (tbc): Multipath configuration update for OGMA (10 - 2)
- Monday 19th July: New robot controller brought into use.
- Tuesday 20th July: Microcode update for tape robot
- 2nd August: Stop SL4 batch service. (Turn off SL3 UIs)
- Not yet scheduled: Restrict access to FTS & LFC databases via Oracle 'ACLs'
Other Changes
- Fabric:
  - Double the network link to the tape robot stack (stack 12), postponed from the last TS. (Requires Castor stop).
  - Swap out the older of the pair of SAN switches in the Tier1 Oracle databases for its new replacement. (Requires FTS, LFC, 3D stop).
  - New kernels and glibc updates on non-castor Oracle RAC nodes. (Done for LUGH).
- Database:
  - None
- Grid Services:
  - Add Quatorised BDII to Top-BDII set. (Below threshold for technical stop).
- Castor:
  - Possible SRM update
- Networks:
  - Commissioning OPN link