RAL Tier1 weekly operations castor 30/11/2009

From GridPP Wiki

Summary of Previous Week

  • New Quattor configuration works, including with individual files, so RPMs don't have to be rebuilt (Richard)
  • Gave access to Gen for non-LHC evaluation of CASTOR (Shaun)
  • Setting T2K up (Shaun)
  • MICE configuration - waiting for Henry to test (Shaun)
  • Liaising for ATLAS during data-taking (Brian)
  • Got PHP RPM from CERN and added it to our repository (Chris)
  • CoC duties (Chris)
  • Polymorphic servers and improving resilience on central servers (Chris, Shaun)
  • Repacking bad tapes (Tim)
  • Assisting DB team in testing failover on Vulcan (Cheney)
  • TSBN stats (Cheney)
  • Successfully test-restored two servers from backup, following the documentation (Cheney)
  • Finalizing disaster recovery document (Matthew)
  • Chasing strategic objectives (Matthew)

Developments for this week

  • Polymorphic servers and improving resilience on central servers (Chris, Shaun)
  • Configuring new (non-LHC) CIP instance for publishing T2K information (Jens)
  • SRM development (Shaun)
  • Testing builds (Richard)
  • Nagios monitoring scripts (Cheney)
  • Investigate lhcbUser D2D copy problems (Matthew)

Operations Issues

  • Memory issue on ATLAS SRM database - caught in time and service redistributed across two nodes

Blocking issues

none

Planned, Scheduled and Cancelled Down Times

none

Advanced Planning

  • Gen upgrade to 2.1.8 in 2010 Q1
  • Black and White lists will *not* now be introduced on ATLAS until 2.1.8
  • Install/enable gridftp-internal on Gen (This year/before 2.1.8 upgrade)

Staffing

  • CASTOR on-call person: Shaun