RAL Tier1 weekly operations castor 04/03/2013

From GridPP Wiki
Revision as of 14:48, 5 March 2013 by Matt viljoen

Operations News

  • 2.1.13 stress testing underway. Will also be using the new SL6 WNs this week.
  • All remaining production tape servers now upgraded to 2.1.13-9
  • Currently setting up a new admin VM dedicated to running tape verification
  • Change to TCP for rsyslog monitoring now being rolled out

Operations Problems

  • (Sun) Gen SRM problems that appear to be a memory leak, which only seems to affect Gen. A monthly restarter has been implemented.

Blocking Issues

  • Can't upgrade Puppet until someone spends time learning to administer it (to replace Chris); this may delay an SL6 upgrade

Planned, Scheduled and Cancelled Interventions

Entries in/planned to go to GOCDB: none

Advanced Planning

Tasks

  • Test and certify 2.1.13-9 with simplified Quattor templates
  • Turn off Amanda backups

Interventions

  • Upgrade tape servers to 2.1.13-9
  • Upgrade central services (NS,CUPV,VDQM) from 2.1.11-9 to 2.1.13-9
  • Upgrade stagers from 2.1.12 to 2.1.13

Staffing

  • Castor on-call person
    • Matthew
  • Staff absence/out of the office:
    • ..