RAL Tier1 weekly operations castor 03/06/2013

From GridPP Wiki

Operations News

  • A fix was deployed to Facilities, and the ORA "in-doubt distributed transaction" problem has since gone away. We will decide this week whether to proceed with the Tier 1 2.1.13 upgrades.

Operations Problems

  • (Wed) Heavy ATLAS activity put some atlasStripInput disk servers under very high load. Up to 100 gridftp transactions were observed, which is surprising because the slot weighting value for gridftp should give a theoretical maximum of 80 transactions. We may also need to increase the weighting for gridftp, especially since the total number of slots has been increased on the newest generation of disk servers.
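
The relationship between total slots, slot weighting and the theoretical maximum can be sketched as below. This is only an illustration, not the scheduler's actual logic: it assumes each gridftp transaction consumes a fixed number of slots equal to its weighting, and the slot counts and weighting used (400, 600, 5) are hypothetical values chosen to reproduce a cap of 80, not figures taken from our configuration.

```python
def max_concurrent_transfers(total_slots: int, slot_weight: int) -> int:
    """Theoretical cap on simultaneous transfers for one protocol.

    Assumption: every transfer occupies exactly `slot_weight` slots
    out of `total_slots` on a disk server.
    """
    if total_slots < 0:
        raise ValueError("total_slots must be non-negative")
    if slot_weight <= 0:
        raise ValueError("slot_weight must be positive")
    return total_slots // slot_weight


def weight_for_cap(total_slots: int, desired_cap: int) -> int:
    """Smallest integer weighting that keeps the cap at or below `desired_cap`."""
    if total_slots < 0 or desired_cap < 0:
        raise ValueError("arguments must be non-negative")
    # floor(S / w) <= C holds exactly when w > S / (C + 1).
    return total_slots // (desired_cap + 1) + 1


# Hypothetical older server: 400 slots, weighting 5 gives a cap of 80.
older_cap = max_concurrent_transfers(400, 5)

# Hypothetical newer server with more slots: same weighting, higher cap.
newer_cap = max_concurrent_transfers(600, 5)

# Weighting needed on the newer server to bring the cap back to 80 or below.
newer_weight = weight_for_cap(600, 80)
```

Under these assumptions, raising the total slots without raising the weighting lifts the cap (80 to 120 in the hypothetical numbers), which is why the weighting may need revisiting on the newer servers. It would not by itself explain seeing 100 transactions on a server whose cap is 80.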

Blocking Issues

  • Puppet cannot be upgraded until someone has spent time learning to administer it (to replace Chris); this may delay an SL6 upgrade.

Planned, Scheduled and Cancelled Interventions

Entries in, or planned to go to, GOCDB: none

Advanced Planning

Tasks

  • None

Interventions

  • Upgrade central services (NS,CUPV,VDQM) from 2.1.11-9 to 2.1.13-9
  • Upgrade stagers from 2.1.12 to 2.1.13

Staffing

  • CASTOR on-call person
    • Matthew
  • Staff absence/out of the office:
    • (Mon-Tue morning) security workshop (Rob, Brian)
    • (Tue afternoon - Wed) HEPSYSMAN (Rob, Brian, Matt?)