Production Team Report 2010-10-11

RAL Tier1 Production Team Report for 11th October 2010.

AoD This Week

Mon - Wed: Gareth; Thu: John; Fri: Tiju

Last week

  • Gareth: AoD (1 day)
  • John: AoD (4 days)
  • Tiju:

Changes to Operating procedures

  • None.

Declared Outages in GOC DB

  • CE01 currently down for upgrade to Quattorized CREAM CE on SL5. (Will be replaced by CE09).
  • Tuesday 12th October. At Risk on site for network reset during 'Network At Risk' period.
  • Tuesday 12th October. At Risk on Site-BDIIs for reboot to update kernels.
  • Thursday 14th October. Outage on LFC/FTS & 3D for kernel updates to RAC nodes.
  • Monday 18th October - R89 Transformer Checks.
  • Wednesday 20th October - UPS maintenance.

Advance Warning

  • Monday 13th December - UPS test.
  • Some more kernel updates may be required.
  • Remaining Castor upgrades almost certainly on the following dates:
    • Upgrade Gen (including ALICE) - during the week beginning 25 October
    • Upgrade CMS - during the week beginning 8 November
    • Upgrade ATLAS - during the week beginning 22 November

Other Changes

  • Fabric:
    • Double the network link to the tape robot stack (stack 12), postponed from the last TS. (Requires Castor stop).
    • Swap out the older of the pair of SAN switches in the Tier1 Oracle databases for its new replacement. (Requires FTS, LFC, 3D stop).
    • Update firmware in RAID controller cards for a batch of disk servers.
  • Database:
    • Re-visit non-Castor database multipathing
  • Grid Services:
    • None.
  • Castor:
    • Possible SRM update
  • Networks:
    • None.