RAL Tier1 weekly operations castor 17/05/2010

From GridPP Wiki
Revision as of 14:12, 13 August 2010 by Matt viljoen (Talk | contribs)

Summary of Previous Week

  • Matthew:
    • Reviewing the blessing stage of disk server deployment and logistics of V08/V09 deployment
    • Deploying disk servers
    • Reviewing preprod progress
    • CoD+Depmon work
  • Shaun:
    • Set up SuperB VO
    • Some SRM development
    • Cleared up problem repack migrations
    • Investigated T2K migration backlog
  • Chris:
    • ..
  • Richard:
    • Finishing off remaining p/p benchmarks
    • Writing up the results of those benchmarks
  • Brian:
    • ..
  • Jens:
    • ..

Developments for this week

  • Matthew:
    • Try to find time to upgrade and test new puppetmaster
    • Deploying disk servers
    • Alternatives to CASTOR
    • CoD+Depmon work
  • Shaun:
    • Disk server blessing
    • SRM development
  • Chris:
    • ..
  • Richard:
    • Preparing changes needed to p/p configuration
  • Brian:
    • ..
  • Jens:
    • ..

Operations Issues

  • T2K have started using Gen heavily with small files, resulting in a large backlog of migration candidates currently sitting on RAID5 servers.
  • Problems were preventing SuperB information from being published. Now fixed.
  • Failure of gdss380 caused lhcbMdst to run out of space. A disk server was added as an emergency measure over the weekend, which resolved the problem.

Blocking issues

None

Planned, Scheduled and Cancelled Interventions

Entries in/planned to go to GOCDB

None

Advanced Planning

  • Upgrade to 2.1.8/2.1.9 2010
  • Upgrade to SRM 2.8-6 after testing is complete
  • ATLAS want to know how much capacity is available in disabled servers (published as Capability). This needs a low-priority CIP change.
  • CASTOR Instance for Non LHC 2010Q2
  • Install/enable gridftp-internal on Gen (Before 2.1.8 upgrade)

Staffing

  • Castor on Call person: Matthew
  • Staff absences: