Production Team Report 2010-04-26

From GridPP Wiki

RAL Tier1 Production Team Report for 26th April 2010.

AoD This Week

Mon & Tues: John; Wed: Gareth; Thu: James; Fri: John

Note: Production team training all day on Thursday 29th April.

Last Week (19 April - 26 April)

  • Gareth: AoD (<1 day), planning this week's interventions, started APRs, started planning HEP SYSMAN.
  • John: AoD (<1 day), updated fetch-crl script, reported on gdss92 problem, set up downtime reporting script on nagger, SSC training.
  • Tiju: AoD (4 days), updated Dashboard (addition of Castor usage), SSC training.

Changes to Operating procedures

  • None

Declared Outages in GOC DB

  • For Wednesday 28th April (starting at 10:30 for main work).
    • Re-balancing of SOMNUS database.
    • Add switch into stack in UPS room.
    • Add backup node to Castor SAN.
    • Note: Not yet declared, but a possible update to the Atlas software server.

Advance Warning

  • Thursday 29th: Add 32-bit Castor libraries to SL5 worker nodes. ("At Risk" on CEs)
  • Next week:
    • Oracle patching on LUGH, OGMA & SOMNUS.
    • Adding GLEXEC and middleware updates to one of the CEs. (Other CEs in subsequent weeks)
  • Still to do:
    • Fix glibc mismatch on OGMA database. Apply kernel & glibc updates to LUGH & SOMNUS.