RAL Tier1 weekly operations castor 24/05/2010

From GridPP Wiki

Summary of Previous Week

  • Matthew:
    • Deploying disk servers
    • Alternatives to CASTOR
    • CoD+Depmon work
  • Shaun:
    • Deploying disk servers
    • SRM Work
    • Announced super-B availability
    • Investigation of migrated files still in CANBEMIGR state
    • Upgrade testing co-ordination
  • Chris:
    • Castor 2.1.8/2.1.9 tests
    • Testing SL5 xfs disk server and updating documentation
    • Preparing new puppet manifests for SL5
  • Richard:
    • Wrote up p/p benchmark results on a wiki page
    • Learnt how to run the functional test suite
    • Contributed to "solving the Mystery of the Incredible Vanishing Resource BDII"
  • Brian:
    • Consistency check of ATLAS
  • Jens:
    • With help from Fabric and Services, solving the Mystery of the Incredible Vanishing Resource BDII

Developments for this week

  • Matthew:
    • More deployment of disk servers, blessing etc.
    • Depmon duties
    • Upgrading new puppet master if I have time
  • Shaun:
    • DEVELOPMENT
  • Chris:
    • Castor 2.1.8/2.1.9 tests
    • Deploying SL5 xfs disk server and updating documentation if needed
    • Rolling out new puppet manifests for SL5
    • Castor on duty
  • Richard:
    • Possible fallout items from "The Mystery of the Incredible Vanishing Resource BDII"
    • P/P upgrade items
    • Add more "bracketing" to report on p/p benchmarking
  • Brian:
    • ..
  • Jens:
    • The Mythical CIP Month

Operations Issues

  • A misconfiguration on gdss458 caused a number of ROOT jobs to fail. This was fixed on Wednesday and the documentation has been updated.
  • The CASTOR resource BDIIs published empty output from Wednesday evening to Thursday afternoon. The CIP was producing output, but the BDIIs published nothing.

Blocking issues

None

Planned, Scheduled and Cancelled Interventions

Entries in/planned to go to GOCDB

None

Advanced Planning

  • Upgrade to 2.1.8/2.1.9 2010
  • Upgrade to SRM 2.8-6 after testing is complete
  • ATLAS want to know how much capacity is available in disabled servers (published as Capability). This needs a low-priority CIP change.
  • CASTOR Instance for Non LHC 2010Q2
  • Install/enable gridftp-internal on Gen (Before 2.1.8 upgrade)

Staffing

  • Castor on Call person: Chris
  • Staff absences:
    • Chris working at home Thurs pm