RAL Tier1 weekly operations castor 13/07/2009

From GridPP Wiki
Revision as of 11:35, 14 July 2009 by Matt viljoen (Talk | contribs)


Summary of Previous Week

  • Moving old tape robot and SAN (Tim)
  • Getting CASTOR tape drives running (14/16 available) (Tim)
  • Certification of 2.1.7-27 with new LSF configuration (Chris)
  • Relocating Certification and Preprod instances (Cheney)
  • Repairing blown PSUs (Cheney)
  • SRM Development (Shaun)
  • ATLAS intervention (Shaun, Matt, Gordon, Keir, Eter)

Developments for this week

  • 2.1.7-27 upgrade + LSF reconfiguration (Chris, all)
  • Testing old tape robots and moving non-CASTOR tape drives (Tim)
  • SRM development (Shaun)
  • CIP development (Jens)

Ongoing

  • Cleaning up database for a future 2.1.8 upgrade (Shaun)
  • Setting up Preproduction (Matt)
  • Prepare preproduction platform for stress testing (Chris/Matt)
  • Adding virtual disk servers to preproduction (Matt)
  • 2.1.7-27 upgrade preparation: testing synchronisation and kernel upgrades (Chris)

Operations Issues

  • Network intervention caused the LSF daemon to die on some disk servers.
  • Some tape drives reporting errors; cleared by Tim.
  • ATLAS database problem led to extended unscheduled downtime.

Blocking issues

None

Scheduled and Cancelled Down Times

Entries in/planned to go to GOCDB

Description                              | Start        | End          | Type     | Affected VO(s)
Apply Oracle BigID patch                 | 13/7/09 0800 | 13/7/09 1700 | At Risk  | All
2.1.7-27 upgrade and LSF reconfiguration | 14/7/09 0800 | 14/7/09 1700 | Downtime | All
2.1.7-27 upgrade and LSF reconfiguration | 14/7/09 0700 | 15/7/09 1700 | At Risk  | All

Changes to Operational Milestones

Description Changed Status
Apply Oracle BigID fix to fix (H) DB team DONE

Advanced Planning

  • SRM 2.8 upgrade (sometime during July)
  • Start using Black and White lists (sometime during July?)
  • CIP upgrade to include nearline publishing (sometime during July/August)
  • Upgrade nameserver to 2.1.8 (September?)

Staffing

  • CASTOR on-call person (also CASTOR on day duty): Shaun
  • Cheney on training course Mon-Wed