Production Team Report 2010-10-04
RAL Tier1 Production Team Report for 4th October 2010.
AoD This Week
Mon-Tue: John; Wed: Gareth; Thu-Fri: John
Last week
- Gareth: AoD (1 day). Some planning for ATLAS (non-)outage.
- John: Updating Nagios tests during Castor upgrade. Dump of Wiki.
- Tiju: AoD (4 days)
Changes to Operating procedures
- Note new number to contact Network Team out of hours.
https://wiki.e-science.cclrc.ac.uk/web1/bin/view/EScienceInternal/CalloutNetworking
Declared Outages in GOC DB
- CMS VO boxes (lcgvo0428 & lcgvo0599) down for decommissioning.
Advance Warning
- Monday 18th October - R89 Transformer Checks.
- Wednesday 20th October - UPS maintenance.
- Monday 13th December - UPS test.
- Some kernel updates will be required.
- Remaining Castor upgrades will almost certainly take place on the following dates:
  - Upgrade Gen (including ALICE) - during the week beginning 25 October
  - Upgrade CMS - during the week beginning 8 November
  - Upgrade ATLAS - during the week beginning 22 November
Other Changes
- Fabric:
  - Double the network link to the tape robot stack (stack 12), postponed from the last TS. (Requires Castor stop).
  - Swap out the older of the pair of SAN switches in the Tier1 Oracle databases for its new replacement. (Requires FTS, LFC, 3D stop).
  - New kernels and glibc updates on non-Castor Oracle RAC nodes. (Done for LUGH).
  - Update firmware in RAID controller cards for a batch of disk servers.
- Database:
  - Re-visit non-Castor database multipathing
- Grid Services:
  - New Quattorized front ends for FTS.
  - Rolling update to Top-BDII nodes to fix disk partitioning layout.
- Castor:
  - Possible SRM update
  - Castor 2.1.9 upgrades
- Networks:
  - None