RAL Tier1 weekly operations Grid 20110509
From GridPP Wiki
Latest revision as of 12:50, 10 May 2011
Operational Issues
Description | Start | End | Affected VO(s) | Severity | Status |
---|---|---|---|---|---|
Downtimes
Description | Hosts | Type | Start | End | Affected VO(s) |
---|---|---|---|---|---|
Blocking Issues
Description | Requested Date | Required By Date | Priority | Status |
---|---|---|---|---|
Developments/Plans
Highlights for Tier-1 Ops Meeting
Highlights for Tier-1 VO Liaison Meeting
Detailed Individual Reports
Alastair
- A/L
Andrew
- April UB schedule, metrics [Done]
- Updated APEL Nagios check (add check of APEL sync test) [Done]
- lcgfts01 OS kernel/errata update [Done]
- Old diskserver removal/draining; removal of cmsWanout; adding diskservers to cmsFarmRead [Ongoing]
- Looked into recent CMS problems [Done]
- Updated FTS Monitor to 1.5.3 [Done]
- Fixing problems with cmsUnmerged plots in castormon [Ongoing]
Catalin
- work on BDII stability [ongoing]
- involved with CREAM CEs installation and configuration [ongoing]
- update glite LFC [ongoing]
- work on quattorised ATLAS Frontier installation [stalled]
- work on non-LHC WMS stability
Derek
- Catching up after A/L [done]
- Investigating issues with lcgce08 [done]
- Incorporating mysql tuning params for CREAM CEs into quattor [done]
- Change control for quattorising lcgce03 [done]
- Trying to get IPMI IP address for services hosts resolved [in progress]
- Documentation [ongoing]
- Moving to 50% Tier 1 on Thursday 12th
VO Reports
ALICE
- Large number of user jobs (~24k out of 26k); efficiency irrelevant, stability of services more important
ATLAS
CMS
- Reprocessing ongoing at all Tier-1s (a lot is still to come...)
- CMS now using 3 GB queue
LHCb
OnCall/AoD Cover
- Primary OnCall:
- Grid OnCall: Derek (Mon-Sun)
- AoD: