RAL Tier1 weekly operations Grid 20100104


Summary of Previous Week

Developments

N/A

Operational Issues and Incidents

Description | Start | End | Affected VO(s) | Severity | Status
New jobs not accepted on WMS01 and WMS02 | Sat 2 Jan 2010 | Sat 2 Jan 2010 | LHC VOs | Medium | Problematic ICE jobs removed; maximum number of jobs increased to 3000
ATLAS jobs hitting wallclock limit | 27 Dec | 28 Dec | ATLAS | Medium | ATLAS jobs were hitting the (normalised) 72-hour wallclock limit, apparently because of slow transfers out of CASTOR. The wallclock limit was raised to 100 hours on the grid3000M queue.

Plans for Week(s) Ahead

Plans

  • Alastair
    • Prepare slides and run Hammer Cloud test for ATLAS UK meeting in Cambridge.
    • Away at ATLAS UK meeting 6th - 8th January.
  • Andrew
    • Prepare December UB schedule document
    • Finish plan for KSI2K to HEP-SPEC06 accounting migration
    • Finish various docs
  • Catalin
    • SL5 LHCb VOBOX installation
    • Work on MySQL migration
    • Follow up issue with t2k.org 'zero size' LFC entries
    • Decommission old SL4 ALICE VOBOXes
  • Derek
    • Testing glExec
    • Testing helpdesk restore
    • Verifying CREAM CE installation instructions
  • Matt
  • Richard
    • Finish plan for BDII changes
    • Write discussion document for DNS proposal
    • Continue building the servers for the CASTOR pre-prod instance
  • Mayo
    • Encrypting passwords for Metric system
    • Automating Metric report system
    • Tape statistics spreadsheet project

Resource Requests

Downtimes

Description | Hosts | Type | Start | End | Affected VO(s)

Requirements and Blocking Issues

Description | Required By | Priority | Status
LHCb SL5 64-bit VOBOX deployment using Quattor | 25 Nov 2009 | Medium | Quattor recipe not yet available (RT#53392)
Hardware for testing LFC/FTS resilience | | High | DataServices want to deploy a DataGuard configuration to test LFC/FTS resilience; request for hardware made through the RT Fabric queue
Hardware for PPS | | High | We have made a commitment to test PPS pre-releases, and have no hardware dedicated to this.
Hardware for Grid Services testbed | | Medium |

OnCall/AoD Cover

  • Primary OnCall: Catalin (Mon-Sun)
  • Grid OnCall:
  • AoD: