Production Team Report 2010-05-17

RAL Tier1 Production Team Report for 17th May 2010.

AoD This Week

Mon & Tues: Tiju; Wed: Gareth; Thu & Fri: Tiju

Last Week (10 May - 17 May)

  • Gareth: AoD (<1 day), APRs, HEP SYSMAN planning, GGUS Alarm ticket meeting, prepared abstract for CHEP.
  • John: AoD (4 days), APR, initiation into blessing Castor disk servers.
  • Tiju: APR, effect of Nagios replacement for SAM tests, dashboard updates, Nagios config changes (nagger).

Changes to Operating procedures

  • As reported last week: Use of the fortnightly Tier1 co-ordination meeting to schedule Tier1s during technical stops. Use this meeting as preparation.

Declared Outages in GOC DB

  • Castor Oracle database Sub-request clean-up: CMS - Monday 17th May
  • Castor Oracle database Sub-request clean-up: Atlas - Wednesday 19th May
  • Castor Oracle database Sub-request clean-up: LHCb - Tuesday 25th May
  • CE07 - re-config for glexec - Monday 17th May.
  • At Risk for UPS test - Tuesday 1st June.

Technical Stops

  • May 31 - June 2:
    • Tuesday 1st June: UPS test (site At Risk)
    • Wednesday 2nd June: Oracle quarterly patching (TBC)
  • June 28-30:
    • Monday 28th June - Transformer checks (Site At Risk at end of afternoon) - TX2
    • Monday 28th June - Transformer checks (Site At Risk for day) - TX3 or 4
  • July 26-28:
    • Transformer checks. (2 days - TX1 & TX3 or 4)

Advance Warning

  • Fabric:
    • Double the network link to the tape robot stack (stack 12), postponed from the last TS. (Requires Castor stop).
    • Swap out the older of the pair of SAN switches in the Tier1 Oracle databases for its new replacement. (Requires FTS, LFC, 3D stop).
    • Multipath mods to stop errors. (Not yet sure of effect).
    • Microcode update for tape robot.
    • Swap Solaris tape controllers (for robot) over (?)
    • New Atlas software server.
    • New kernels and glibc updates on non-castor Oracle RAC nodes. (Done for LUGH). (Added after meeting.)
  • Database:
    • Quarterly Oracle patching
  • Grid Services:
    • CEs will be configured for glexec in rotation.
    • Stop SL4 batch service (August)
    • Add Quatorised BDII to Top-BDII set. (Below threshold for technical stop).
    • Update to FTS 2.2.4. (Below threshold for technical stop).
    • glite 3.2 WMS (Below threshold for technical stop).
  • Castor:
    • Possible SRM update
  • Networks:
    • Commissioning OPN link