RAL Tier1 weekly operations castor 14/12/2009

From GridPP Wiki
Revision as of 09:06, 14 December 2009 by Matt viljoen (Talk | contribs)

Summary of Previous Week

  • Developing our January upgrade strategy (All)
  • More polymorphic server work: finished first version (Chris)
  • Concentrated more on pre-production and the work Richard is doing (Chris)
  • SRM development (Shaun)
  • Work on ATLAS migration problems (Shaun)
  • Work on CMS migration (Shaun)
  • Investigation of timeouts (with CERN) (Shaun)
  • Disk server draining for ATLAS and LHCb (Brian)
  • January upgrade strategy (Matthew)
  • Deployed four RAID6 disk servers from atlasNonProd to atlasT0Raw (Matthew)
  • Setting up and testing new production CIP (Matthew)
  • CoD work (Matthew)

Developments for this week

  • Review configuration for new LSF triplet and run some tests (Chris)
  • Continue working on polymorphic CASTOR head nodes (Chris)
  • Test and switch production LSF triplet to new hardware (Chris)
  • Additional kernel + security updates (Chris)
  • Continue working on Pre-Production instance (Chris)
  • Build of new robot controller (Cheney)
  • CASTOR pre-prod testing (Shaun)
  • More SRM development (Shaun)
  • Continue looking at tape problems thrown up by repack (Tim)
  • Finalize CIP 2.1.0 testing and release to CERN, CNAF, and ASGC (Jens)
  • Setting up new CIP instance for T2K etc. (Jens)
  • Setting up and testing new production CIP (Matthew)
  • CASTOR for facilities planning (Matthew)

Ongoing work

  • Investigate lhcbUser D2D copy problems (Matthew)
  • Continue building castoradm1 replacement (Cheney)

Operations Issues

None

Blocking issues

  • Lack of Quattor configuration files for SLC4.8 is preventing us from evaluating Quattor alongside CASTOR 2.1.8. The pre-production setup will initially proceed with a Kickstart-based deployment.

Planned, Scheduled and Cancelled Interventions

  • Deploy new CIP for T2K, ASAP (Pending approval)
  • Replace CIP hosting machine with a new one on more resilient hardware, after 21/12/09 (Pending approval)
  • Deploy new LSF triplets, 14/01/10 (Pending approval)

Advanced Planning

  • Gen upgrade to 2.1.8 2010Q1
  • Install/enable gridftp-internal on Gen (This year/before 2.1.8 upgrade)

Staffing

  • CASTOR on-call person: Chris