RAL Tier1 weekly operations Grid 20110613

Revision as of 14:46, 13 June 2011 by Catalin condurache (Talk | contribs)


Operational Issues

Description  Start  End  Affected VO(s)  Severity  Status

Downtimes

Description            Hosts                    Type  Start        End          Affected VO(s)
kickstart -> quattor   lcgce05.gridpp.rl.ac.uk  SD    Fri 10 June  Wed 15 June  non-LHC
glite3.2 CREAM update  lcgce08.gridpp.rl.ac.uk  SD    Tue 7 June   Fri 10 June  ATLAS, LHCb

Blocking Issues

Description  Requested Date  Required By Date  Priority  Status

Developments/Plans

Highlights for Tier-1 Ops Meeting

Highlights for Tier-1 VO Liaison Meeting

Detailed Individual Reports

Alastair

  • Catching up after several weeks of annual leave.
  • Attended Frontier workshop on Wednesday last week.
  • Working from home Thursday + Friday and for the start of this week.
  • Working on permission change. Working on a fix to pilot submission and on dark data cleanup, both needed before the scripts can work. Talking to Shaun about new solutions.
  • Working on ways to increase efficiency of ATLAS jobs on batch farm.
  • Various CVMFS fixes, including a new test and debugging of ATLAS jobs.

Andrew

  • Set up and tested 100% Quattorized CMS squid [Done]
  • Put SRM/gridftp split onto production FTS [Done]
  • Fixed bug in fts-mon pages, fixed CE Ganglia pages [Done]
  • Other: testing new ACLs [Done]; testing CMS xrootd [Ongoing]
  • Misc: job plan, organizing CERN trip [Done]

Catalin

  • lcgce08 gLite CREAM update (LHCb, CMS, ALICE) [done]
  • job plan [done]
  • lcgce05 quattorisation (non-LHC) [ongoing]
  • involved with CREAM CEs installation and configuration [ongoing]
  • work on quattorised ATLAS Frontier installation [ongoing]
  • work on BDII stability [stalled]
  • update gLite LFC [stalled]

Derek

(SCT - Mon-Thu, Tier 1 - Fri (or as requested))

  • Documentation [ongoing]
  • Quattor tidy up [ongoing]
  • Handover [ongoing]
  • Metrics
  • From 12/6/11: 20% Tier 1

VO Reports

ALICE

ATLAS

CMS

  • Data & MC reprocessing ongoing
  • Ongoing CASTOR issue: there are periods when incoming/outgoing transfers fail because PrepareToPut and PrepareToGet requests take longer than 180 s.
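
To illustrate the 180 s condition in the CASTOR item above, here is a minimal sketch that flags requests exceeding the threshold and reports the fraction affected. Only the 180 s limit and the request names come from the report; the `SrmRequest` record, the helper functions and the sample durations are hypothetical and do not reflect any CASTOR, SRM or FTS API or real measurements.

```python
from dataclasses import dataclass

# Threshold quoted in the report; everything else here is illustrative.
TIMEOUT_SECONDS = 180


@dataclass
class SrmRequest:
    """Hypothetical record of one request (not a real CASTOR/SRM type)."""
    kind: str          # "PrepareToPut" or "PrepareToGet"
    duration_s: float  # time the request took to complete, in seconds


def timed_out(requests, timeout_s=TIMEOUT_SECONDS):
    """Return the requests whose duration exceeds the timeout."""
    return [r for r in requests if r.duration_s > timeout_s]


def timeout_fraction(requests, timeout_s=TIMEOUT_SECONDS):
    """Return the fraction of requests exceeding the timeout (0.0 if none given)."""
    if not requests:
        return 0.0
    return len(timed_out(requests, timeout_s)) / len(requests)


# Made-up sample durations, for demonstration only.
sample = [
    SrmRequest("PrepareToPut", 12.0),
    SrmRequest("PrepareToGet", 240.5),
    SrmRequest("PrepareToPut", 181.0),
    SrmRequest("PrepareToGet", 95.3),
]

slow = timed_out(sample)
print([r.kind for r in slow])
print(timeout_fraction(sample))
```

Grouping such records by time window would be one way to pick out the periods in which transfers fail, assuming request durations can be extracted from the relevant logs.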

LHCb

OnCall/AoD Cover

OnCall Rota

  • Primary OnCall:
  • Grid OnCall: Catalin (Mon-Sun)
  • AoD: