RAL Tier1 weekly operations castor 01/07/2013

From GridPP Wiki

Operations News

  • CASTOR 2.1.13-9 NS upgrade scheduled for Wednesday 2013/07/03.
  • All Puppet clients now upgraded to 2.7.22.

Operations Problems

  • Callout on Thursday evening - high CPU usage by diskmanagerd on gdss611. This was found not to be a CASTOR issue; the actual cause appeared to be a load spike coupled with a lack of irqbalance on some disk servers.
  • Callout early on Friday morning - stagerd started to cause swapping on the ATLAS stager. Restarting the daemon solved the immediate issue, but the cause is undetermined and needs further investigation.

Blocking Issues

  • None

Planned, Scheduled and Cancelled Interventions

Entries in/planned to go to GOCDB

CASTOR 2.1.13-9-2 name server upgrade on Wednesday 2013/07/03.

Advanced Planning

Tasks

  • None

Interventions

  • Upgrade central services (NS,CUPV,VDQM) from 2.1.11-9 to 2.1.13-9
  • Upgrade stagers from 2.1.12 to 2.1.13

Staffing

  • CASTOR on-call person
    • Matt
  • Staff absence/out of the office:
    • None