RAL Tier1 weekly operations Grid 20101122

From GridPP Wiki

Latest revision as of 13:37, 23 November 2010

Operational Issues

Description Start End Affected VO(s) Severity Status

Downtimes

Description Hosts Type Start End Affected VO(s)

Blocking Issues

Description Requested Date Required By Date Priority Status

Developments/Plans

Highlights for Tier-1 Ops Meeting

Highlights for Tier-1 VO Liaison Meeting

Detailed Individual Reports

Alastair

  • ATLAS TaskForce
  • Fixing bugs with ATLAS re-processing to make sure it runs smoothly at RAL.
  • Working on returning gdss391 to production.
  • Working on ATLAS permission change. [On hold]
  • Preparing for CERN trip next week. (Producing talks)

Andrew

  • Capacity planning system project [Ongoing]
  • CMS CASTOR testing [Ongoing]
  • CMS data ops
    • Pile-up MC reprocessing at FNAL [Done]
    • Accounting for Nov4 rereco
    • Skims at FNAL [Ongoing]
    • WMAgent testing [Ongoing]

Catalin

  • LB service migration to gLite3.2 [done]
  • work on (x)ROOT(d); deploy test infrastructure [ongoing]
  • test squid on LHCb VOBOX [done]
  • update glite-WMS
  • work on Tier1 DB migration plans
  • work on WMS monitoring [ongoing]

Derek

  • Investigation of secure deployment of ssh keys to hosts [ongoing]
  • Reinstalling lcgce08 [ongoing]
  • Investigating solutions for whole node scheduling [ongoing]
  • Attending NGS Innovation Forum (Tue-Wed)

Matt

  • Switch to gLite 3.2 FTS frontends (November 24). [New]
  • Reprofile disk capacity. [New]
  • Deploy top BDII on EC2. [Ongoing]
  • Writing storage testbed proposal. [Ongoing]
  • Quattorisation of FTM. [Ongoing]
  • Deploying PBS JobMon monitoring tools. [Stalled]
  • Test FTS SRM/GridFTP ratio configuration. [Stalled]
  • Quattorisation of MyProxy nodes. [Done]
  • Further testing of Quattorised gLite3.2 FTS FEs. [Done]

Richard

  • 1.5 days A/L
  • Working on the tool for automatically checking middleware baselines [Ongoing]
  • Developing a set of Quattor templates for an ARGUS server [Ongoing]
  • Developing a "pseudo-update" to apply gLite update 19 to BDIIs [Ongoing]
  • Updated the CGI script for logging hardware requests from G/S team in the Fabric queue in RT [Ongoing]
  • Working on the "team status page" being developed as an action from team awayday [Ongoing]
  • Reviewing G/S process documentation [Ongoing]
  • CASTOR items:
    • Applied RPM errata and kernel versions to the 4 CIP servers.

VO Reports

ALICE

ATLAS

CMS

  • Daily metric has been ERROR since CASTOR upgrade. Site readiness is now NOT READY.

LHCb

OnCall/AoD Cover

OnCall Rota

  • Primary OnCall:
  • Grid OnCall: Derek (Mon-Sun)
  • AoD: Catalin (Wed)