Production Team Report 2010-07-26

RAL Tier1 Production Team Report for 26th July 2010.

AoD This Week

Mon: John, Tue: John, Wed: Tiju, Thu: John, Fri: John

Last week

  • Gareth: A/L all week
  • John: AoD (1 day), Gareth stand-in, looking at Nagios checks for Castor 2.1.9.
  • Tiju: AoD (4 days), Modem script, Nagios updates, SHE Audit.

Changes to Operating procedures

  • None.

Declared Outages in GOC DB

  • July 26th: At risk on lcgce01 for CMS re-config.
  • July 27th: At risk on lcgce06 and lcgce07 for CMS re-config.
  • July 28th: At risk on FTS and LFC for database security configuration.

Advance Warning

  • 2nd August: Stop SL4 batch service. (Turn off SL3 UIs)
  • Early August: Turn off lcgce02. A downtime is currently declared from 02/08/2010 to 10/08/2010.

Other Changes

  • Fabric:
    • Double the network link to the tape robot stack (stack 12), postponed from the last TS. (Requires Castor stop).
    • Swap out the older of the pair of SAN switches in the Tier1 Oracle databases for its new replacement. (Requires FTS, LFC, 3D stop).
    • New kernels and glibc updates on non-Castor Oracle RAC nodes. (Done for LUGH).
  • Database:
    • Revisit non-Castor database multipathing.
  • Grid Services:
    • Add Quatorised BDII to Top-BDII set. (Below threshold for technical stop).
  • Castor:
    • Possible SRM update
  • Networks:
    • Commissioning OPN link