Production Team Report 2010-11-22

From GridPP Wiki
Revision as of 13:30, 22 November 2010 by John kelly

RAL Tier1 Production Team Report for 22nd November 2010.

AoD This Week

  • Mon - Wed: John
  • Wed: Catalin
  • Thurs - Fri: John

Last week

  • Gareth: AoD (1 day)
  • John: CMS Castor 219 migration and some scripting.
  • Tiju: AoD (4 days)

Changes to Operating procedures

  • None

Declared Outages in GOC DB

  • Tue - Thu 22-25 Nov. re-install of lcgCE08 as a creamCE.
  • Tue 23rd Nov. Tape System Unavailable. Work on tape robot to resolve problem with power supply cooling.
  • Wed 24 Nov. lcgfts: switch over to using the gLite 3.2 Web Service.
  • Mon 6th - Wed 8th Dec. Upgrade of Atlas Castor instance.

Advanced Warning

  • TODAY and later this week - add new SRMs for Atlas
  • Remaining Castor: current scheduling (T.B.C.) on the following dates:
  • Weekend 11/12 Dec: Power outage in Atlas building.
  • Monday 13th December - UPS test.

Other Changes

  • Fabric:
    • Tape Drive Microcode Update.
    • Double the network link to the tape robot stack (stack 12), postponed from the last TS. (Requires Castor stop).
    • Swap out the older of the pair of SAN switches in the Tier1 Oracle databases for its new replacement. (Requires FTS, LFC, 3D stop).
  • Database:
    • Re-visit non-Castor database multipathing
    • Increase shared memory for OGMA, LUGH & SOMNUS
  • Grid Services:
    • Changes to increase resilience of the BDII service
    • Quattorised gLite 3.2 LB nodes being put into production
    • Quattorisation of FTS Web Service hosts
  • Castor:
    • Change ATLAS castor permissions to prevent users deleting data
  • Networks:
    • None