RAL Tier1 weekly operations Grid 20110207

From GridPP Wiki

Operational Issues

Description | Start | End | Affected VO(s) | Severity | Status

Downtimes

Description | Hosts | Type | Start | End | Affected VO(s)

Blocking Issues

Description | Requested Date | Required By Date | Priority | Status

Developments/Plans

Highlights for Tier-1 Ops Meeting

Highlights for Tier-1 VO Liaison Meeting

Detailed Individual Reports

Alastair

  • Working on ATLAS permission change. [On hold]
  • Working on 0000 Checksum issue [Done?]
  • Learnt about Quattor, disk server connection monitoring.
  • Sorting stuff for conference.
  • Deploying disk servers.

Andrew

  • Completed scripts to enable checksum checking of incoming CMS files [Done]
  • Migration to FTS groups change control & testing [Done]
  • Update fts-setup.pl for maintaining groups [Ongoing]
  • Upgraded CASTOR clients on CMS VOBOX to 2.1.9-6 [Done]
  • Fixing APEL [Done?]
  • Jan accounting [Done]
  • CMS condor training 14-15 Feb
  • Capacity planning system [Ongoing]
  • CMS data ops
    • MC rereco at IN2P3 [Ongoing]

Catalin

  • update Grid Services documentation (WMS, Frontier)
  • start getting more deeply involved with CREAM CEs
  • work on 'pheno' issues with WMS job submission [ongoing]
  • Frontier servlet update on ATLAS server (v3.27) [done]
  • deploy check_job_submission Nagios test on all CEs [done]

Derek

  • Deployed whole node scheduling on batch farm [Done]
  • Wrote TDG talk [Done]
  • Investigating the effect of whole node jobs on the scheduler
  • Reviewing CE documentation
  • Tidying up/finishing off in preparation for 2 weeks A/L

Matt

  • Test new implementation plan for FTS Group config. [New]
  • Prepare FTS hardware for Agent Quattorisation. [New]
  • Review VOBOX/CE incident. [Ongoing]
  • Meet to discuss better CE/WMS service coverage. [Done]
  • Review MyProxy Nagios plugin. [Done]
  • Disk Deployment meeting (2011 pledges). [Done]

Richard

  • New site BDII (lcgsbdii0652) now in service. [Done]
  • Upgrading castor client s/w on ui01 and ui02. ui01 done (using yum update), now working on ui02 (via Quattor).
  • Trying out new hypervisor (hv-10) to see how much performance has improved (have moved an existing VM across to the new h/v). [Ongoing]
  • Developing a set of Quattor templates for an ARGUS server. This has now morphed into evaluating the set of templates provided by QWG. [Ongoing]
  • Working on the "team status page" being developed as an action from the team awayday. [Ongoing]
  • Reviewing G/S process documentation [Ongoing]
  • CASTOR items:
    • Working with SDW to import latest CASTOR quattor structure into the "cert-in-a-box" cluster. [Ongoing]

VO Reports

ALICE

ATLAS

CMS

LHCb

OnCall/AoD Cover

OnCall Rota

  • Primary OnCall:
  • Grid OnCall: Derek
  • AoD: