RAL Tier1 weekly operations castor 11/04/2011

From GridPP Wiki
Revision as of 08:49, 15 April 2011 by Matt viljoen (Talk | contribs)

Operations News

  • CMS are trialling a configuration whereby cmsFarmRead is used to access files instead of cmsWanOut, removing the need to do mass disk2disk copying.

Operations Issues

  • None

Blocking Issues

  • Lack of production-class hardware running Oracle 10g needs to be resolved before CASTOR for Facilities goes into full production. The hardware has arrived and is awaiting installation.

Planned, Scheduled and Cancelled Interventions

Entries in/planned to go to GOCDB

Description                    Start           End             Type      Affected VO(s)
Upgrade LHCb SRMs to 2.10-2    12 April 11:00  12 April 13:00  Downtime  LHCb
Upgrade CMS SRMs to 2.10-2     13 April 11:00  13 April 13:00  Downtime  CMS
Upgrade ATLAS SRMs to 2.10-2   14 April 11:00  14 April 13:00  Downtime  ATLAS
Upgrade Gen SRMs to 2.10-2     15 April 11:00  15 April 13:00  Downtime  Gen

Advanced Planning

  • Upgrade of CASTOR clients on WNs to 2.1.10-0
  • Upgrade tape subsystem to 2.1.10-1, which allows us to support files >2TB
  • Move Tier1 instances to new database infrastructure with a Data Guard backup instance in R26
  • Upgrade Facilities instance to 2.1.10-0
  • Move Facilities instance to new Database hardware running 10g
  • Upgrade SRMs to 2.10-3, which incorporates VOMS support
  • Start migrating from T10KA to T10KC media later this year
  • Quattorization of remaining SRM servers
  • Hardware upgrade and Quattorization of CASTOR headnodes

Staffing

  • Castor on Call person: Matthew
  • Staff absence/out of the office:
    • Matthew (Mon PM)