RAL Tier1 weekly operations castor 31/12/2012

From GridPP Wiki
Revision as of 08:44, 3 January 2013 by Matt viljoen (Talk | contribs)

Operations News

  • none

Operations Problems

  • (Tue) ATLAS started failing a proportion of transfers, with an ORACLE OCI error appearing in srmbed. The cause was an error returned by ORACLE that the password was about to expire. It was partly fixed by extending the lifetime of the password for the ATLAS SRM.
  • (Thu) The same problem started affecting the ATLAS Stager schema. Once its password lifetime was extended, the error went away. The password lifetime was then extended for all CASTOR schemas to avoid a repeat of the problem.
  • (Thu) The CIP has stopped producing output, for reasons unknown. The error in the log is: "Can't happen: didn't find disk pool: <disk_pool>"
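
The password expiry problems above could be caught earlier by a periodic check. The following is a minimal sketch only, assuming the expiry date of each schema password has already been collected (for example from Oracle's DBA_USERS view). The function name, schema names, dates and warning window are all hypothetical and are not taken from the RAL setup.

```python
from datetime import date, timedelta


def expiring_soon(expiry_by_schema, today, warn_days=14):
    """Return schema names whose password expires within warn_days of today.

    expiry_by_schema maps a schema name to its password expiry date.
    Names are returned soonest-expiring first.
    """
    cutoff = today + timedelta(days=warn_days)
    due = [(expiry, name)
           for name, expiry in expiry_by_schema.items()
           if expiry <= cutoff]
    return [name for expiry, name in sorted(due)]


# Hypothetical schema names and expiry dates, for illustration only.
example = {
    "srm_atlas": date(2013, 1, 5),
    "stager_atlas": date(2013, 1, 20),
    "nameserver": date(2013, 6, 1),
}

print(expiring_soon(example, today=date(2013, 1, 3)))
```

A check of this kind would need to run against every CASTOR schema, not just the one that failed first, since the Stager schema hit the same limit two days after the SRM.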

Blocking Issues

none

Planned, Scheduled and Cancelled Interventions

Entries in/planned to go to GOCDB: none

Advanced Planning

Tasks

  • Simplify and document Quattor templates to make them easier to maintain
  • Test and certify 2.1.13-5 with simplified Quattor templates

Interventions

  • Upgrade stagers from 2.1.12 to 2.1.13 and central services (NS, CUPV, VDQM) from 2.1.11 to 2.1.13

Staffing

  • CASTOR on-call person
    • Matthew
  • Staff absence/out of the office:
    • ..