RAL Tier1 weekly operations Grid 20110613
From GridPP Wiki
Revision as of 14:46, 13 June 2011 by Catalin condurache
Operational Issues
Description | Start | End | Affected VO(s) | Severity | Status |
---|---|---|---|---|---|
Downtimes
Description | Hosts | Type | Start | End | Affected VO(s) |
---|---|---|---|---|---|
kickstart -> quattor | lcgce05.gridpp.rl.ac.uk | SD | Fri 10 June | Wed 15 June | non-LHC |
glite3.2 CREAM update | lcgce08.gridpp.rl.ac.uk | SD | Tue 7 June | Fri 10 June | ATLAS, LHCb |
Blocking Issues
Description | Requested Date | Required By Date | Priority | Status |
---|---|---|---|---|
Developments/Plans
Highlights for Tier-1 Ops Meeting
Highlights for Tier-1 VO Liaison Meeting
Detailed Individual Reports
Alastair
- Catching up after several weeks of Annual Leave!!!
- Attended Frontier workshop on Wednesday last week.
- Working from home Thursday + Friday and for the start of this week.
- Working on permission change. Working on a fix to pilot submission and on dark data cleanup, needed before the scripts can work. Talking to Shaun about new solutions.
- Working on ways to increase efficiency of ATLAS jobs on batch farm.
- Various CVMFS fixes, including a new test and debugging of ATLAS jobs.
Andrew
- Set up and tested 100% Quattorized CMS squid [Done]
- Put SRM/gridftp split onto production FTS [Done]
- Fixed bug in fts-mon pages, fixed CE Ganglia pages [Done]
- Other: testing new ACLs [Done]; testing CMS xrootd [Ongoing]
- Misc: job plan, organizing CERN trip [Done]
Catalin
- lcgce08 gLite CREAM update (LHCb, CMS, ALICE) [done]
- job plan [done]
- lcgce05 quattorisation (non-LHC) [ongoing]
- involved with CREAM CEs installation and configuration [ongoing]
- work on quattorised ATLAS Frontier installation [ongoing]
- work on BDII stability [stalled]
- update glite LFC [stalled]
Derek
(SCT - Mon-Thu, Tier 1 - Fri (or as requested))
- Documentation [ongoing]
- Quattor tidy up [ongoing]
- Handover [ongoing]
- Metrics
- From 12/6/11: 20% Tier 1
VO Reports
ALICE
ATLAS
CMS
- Data & MC reprocessing ongoing
- Ongoing CASTOR issue: incoming/outgoing transfers intermittently fail when PrepareToPut and PrepareToGet requests take longer than 180 s.
LHCb
OnCall/AoD Cover
- Primary OnCall:
- Grid OnCall: Catalin (Mon-Sun)
- AoD: