RAL Tier1 weekly operations castor 11/06/2012

From GridPP Wiki
Revision as of 14:37, 11 June 2012 by Matt viljoen (Talk | contribs)


Operations News

  • Upgraded CIP from CIP-2.1.1-2 to latest version, 2.2.8-1
  • Switched from LSF to TM for ATLAS

Operations Problems

  • (Wed) CMS OPs test started failing after the CIP upgrade due to an SRM misconfiguration (the rfio protocol, which CMS does not use, was not supported). Fixed on Friday.
  • (Fri) Transfer failures in the morning due to an FTS problem.

Blocking Issues

none

Planned, Scheduled and Cancelled Interventions

Entries in/planned to go to GOCDB

Description               Start           End             Type      Affected VO(s)  Lead by
2.1.11-9 upgrade          13/06/12 09:00  13/06/12 14:00  Downtime  All             Matthew
ORACLE 11g upgrade (STC)  27/06/12 09:00  27/06/12 17:00  Downtime  All             Rich

Advanced Planning

Tasks

  • Test and certify 2.1.12-4 (Matthew, Chris)
  • Re-instantiate certification on HyperV VMs using Quattor+Puppet (Rob)
  • Selection of disk-only prototype solution (Shaun, Rob, Brian, James)

Interventions

  • Upgrade repack to 2.1.12-4 (Jun)
  • Upgrade Castor Facilities and Tier1 instances to 2.1.11-9 (Jul)
  • Upgrade to 2.1.12 on Tier1 instances once we are happy with TM and TG in performance (Jul)

Staffing

  • Castor on-call person: Matthew
  • Staff absence/out of the office:
    • (Mon-Wed) Shaun, Jens at EUDAT, Stockholm