RAL Tier1 weekly operations Grid 20100104
From GridPP Wiki
Revision as of 15:59, 4 January 2010 by Andrew lahiff (Talk | contribs)
Summary of Previous Week
Developments
N/A
Operational Issues and Incidents
Description | Start | End | Affected VO(s) | Severity | Status |
---|---|---|---|---|---|
New jobs not accepted on WMS01 and WMS02 | Sat 2 Jan 2010 | Sat 2 Jan 2010 | LHC VOs | Medium | Problematic ICE jobs removed; maximum number of jobs increased to 3000 |
Atlas jobs hitting wallclock limit | 27 Dec | 28 Dec | Atlas | Medium | Atlas jobs were hitting the (normalised) 72-hour wallclock limit, apparently because of slow transfers out of Castor. The wallclock limit was raised to 100 hours on the grid3000M queue |
Plans for Week(s) Ahead
Plans
- Alastair
  - Prepare slides and run Hammer Cloud test for ATLAS UK meeting in Cambridge.
  - Away at ATLAS UK meeting 6th - 8th January.
- Andrew
  - Prepare December UB schedule document
  - Finish plan for KSI2K to HEP-SPEC06 accounting migration
  - Finish various docs
- Catalin
  - SL5 LHCb VOBOX installation
  - Work on MySQL migration
  - Follow up issue with t2k.org 'zero size' LFC entries
  - Decommission old SL4 ALICE VOBOXes
- Derek
  - Testing glExec
  - Testing helpdesk restore
  - Verifying Cream CE installation instructions
- Matt
- Richard
  - Finish plan for BDII changes
  - Write discussion document for DNS proposal
  - Continue building the servers for the CASTOR pre-prod instance
- Mayo
  - Encrypting passwords for Metric system
  - Automating Metric report system
  - Tape statistics spreadsheet project
Resource Requests
Downtimes
Description | Hosts | Type | Start | End | Affected VO(s) |
---|---|---|---|---|---|
Requirements and Blocking Issues
Description | Required By | Priority | Status |
---|---|---|---|
LHCb SL5 64bit VOBOX deployment using Quattor | 25 Nov 2009 | Medium | Quattor recipe not yet available (RT#53392) |
Hardware for testing LFC/FTS resilience | | High | DataServices want to deploy a DataGuard configuration to test LFC/FTS resilience; request for HW made through RT Fabric queue |
Hardware for PPS | | High | We have made a commitment to test PPS pre-releases, and have no hardware dedicated for this. |
Hardware for Grid Services testbed | | Medium | |
OnCall/AoD Cover
- Primary OnCall: Catalin (Mon-Sun)
- Grid OnCall:
- AoD: