RAL Tier1 weekly operations castor 25/11/2013

From GridPP Wiki

Operations News

  • We have decreased the CASTOR overhead to 1% on 5 production disk servers. We will wait until they fill up, monitoring them closely, before rolling the change out to all disk servers.
  • Testing started on 2.1.14-3, but there were some problems with xroot testing. Now that 2.1.14-4 (which includes some xroot bug fixes) is out, we will use that instead.

Operations Problems

  • Some inconsistent tape-backed LHCb files were discovered. They were caused by a race condition between the stager and GC that occurs under certain conditions: https://savannah.cern.ch/bugs/index.php?103235. We have increased the GC limit in the meantime, which should reduce the chance of repetition.
  • (Mon) Problems switching to the backup link at CERN caused all SUM tests to fail between 09:00 and 12:00 CET.

Blocking Issues

  • none

Planned, Scheduled and Cancelled Interventions

Entries in/planned to go to GOCDB

  • none

Advanced Planning

Tasks

  • CASTOR 2.1.14 + SL5/6 testing

Interventions

  • none

Staffing

  • CASTOR on-call person:
    • Rob
  • Staff absence/out of the office:
    • Mostly in.