Production Team Report 2011-01-24

From GridPP Wiki

RAL Tier1 Production Team Report for 24th January 2011.

AoD This Week

Mon: Gareth; Tue - Wed: Tiju; Thu: Gareth; Fri: Tiju

Last week

  • Gareth: AoD (2 days); more post mortems on disk servers
  • John: A/L
  • Tiju: AoD (3 days); Updates to puppetmaster monitoring, configuring site-nagios

Changes to Operating procedures

  • None

Declared Outages in GOC DB

  • Monday/Tuesday 31-Jan to 1-Feb:
    • Castor database updates (10.2.0.5)
    • CMS Disk servers to 64-bit
    • Networks
      • VLAN configuration update
      • Double the network link to the tape robot stack (stack 12)
    • Application of kernel update to batch server.

Advance Warning

  • Thursday 3rd Feb: Puppetmaster (and clients) update

Other Changes

  • Fabric:
    • Add a further gateway address to enable an additional IP range.
    • Replace the older of the two SAN switches serving the Tier1 Oracle databases with its new replacement. (Requires stopping FTS, LFC and 3D.)
  • Database:
    • Oracle 10.2.0.5 upgrade. (To be done after CERN has updated its equivalent databases.)
    • Revisit non-Castor database multipathing.
    • Increase shared memory for LUGH & SOMNUS.
  • Grid Services:
    • Changes to increase resilience of the BDII service
  • Castor:
    • Upgrade GEN Disk Servers to 64-bit - date TBD.
    • Change ATLAS Castor permissions to prevent users from deleting data
  • Networks:
    • None