RAL Tier1 weekly operations castor 31/05/2010

From GridPP Wiki
Revision as of 15:33, 28 May 2010 by Chris kruk (Talk | contribs)

Summary of Previous Week

  • Matthew:
    • More deployment of disk servers, blessing etc.
    • DepMon duties
    • CASTOR DB: Way Forward
  • Shaun:
    • ..
  • Chris:
    • Castor 2.1.8/2.1.9 tests
    • Deploying SL5 xfs disk server and updating documentation
    • Deployment of disk servers, blessing etc.
    • Rolling out new puppet manifests for SL5
    • Castor on duty
  • Richard:
    • ..
  • Brian:
    • Stageout files on gdss272
    • Development of method for listing all files in castor
    • FTS checksum case sensitivity
  • Jens:
    • Post-CASTOR-BDII-failure-follow-up-meeting

Developments for this week

  • Matthew:
    • Annual leave
  • Shaun:
    • ..
  • Chris:
    • Castor 2.1.8/2.1.9 tests
    • Deployment of disk servers, blessing etc.
    • Castor on duty
    • DepMon on duty
  • Richard:
    • ..
  • Brian:
    • ..
  • Jens:
    • CIP stuff?

Operations Issues

  • ATLAS SRM logs filled up a partition on 24/5/10, resulting in a callout.
  • CMS stager died at 08:05 on 25/5/10 and was restarted at 09:37. The logs give no indication of the cause.

Blocking issues

None

Planned, Scheduled and Cancelled Interventions

Entries in/planned to go to GOCDB

None

Advanced Planning

  • Upgrade to CASTOR 2.1.8/2.1.9 in 2010
  • Upgrade to SRM 2.8-6 after testing is complete
  • ATLAS want to know how much capacity is available in disabled servers (published as Capability). This needs a low-priority CIP change.
  • CASTOR instance for non-LHC (2010Q2)
  • Install/enable gridftp-internal on Gen (before the 2.1.8 upgrade)

Staffing

  • Castor on-call person: Chris
  • Staff absences:
    • Matthew on annual leave all week