RAL Tier1 weekly operations castor 05/08/2013

From GridPP Wiki
Revision as of 15:43, 5 August 2013 by Matt viljoen (Talk | contribs)

Operations News

  • The last stager (Gen) was successfully upgraded to 2.1.13.
  • The Repack stager was upgraded to 2.1.13. The hosts will be upgraded next week.
  • The 2.1.13-9-2 hotfix was successfully and transparently applied to the Facilities stager schema.
  • lcgsrm01 was transparently removed from the srm-cms alias, and critical RAID firmware updates were applied.

Operations Problems

  • ATLAS was brought down for about half an hour during the Gen 2.1.13 upgrade because of inadequate isolation within the DB operating procedures. A new procedure has been developed and will be used in future upgrades.
  • ATLAS data deletion problems continue. They also appear at the RAL Tier 2, which suggests the cause could be a site networking issue.
  • Draining still occasionally gets stuck. The problem will be investigated and reported back to CERN.

Blocking Issues

  • none

Planned, Scheduled and Cancelled Interventions

Entries in/planned to go to GOCDB

  • none

Advanced Planning

Tasks

  • CASTOR 2.1.14 + SL6 testing, once 2.1.14 is released.

Interventions

  • none

Staffing

  • CASTOR on-call person
    • Rob
  • Staff absence/out of the office:
    • Matthew: annual leave (A/L) from Friday