RAL Tier1 weekly operations Grid 20110131

Operational Issues

Description | Start | End | Affected VO(s) | Severity | Status

Downtimes

Description | Hosts | Type | Start | End | Affected VO(s)

Blocking Issues

Description | Requested Date | Required By Date | Priority | Status

Developments/Plans

Highlights for Tier-1 Ops Meeting

Highlights for Tier-1 VO Liaison Meeting

Detailed Individual Reports

Alastair

  • Working on ATLAS permission change. [On hold]
  • Work on CVMFS local/setup.sh.
  • Working on the 0000 checksum issue.
  • Consistency checking ATLAS space tokens for MC -> Data disk merge.
  • Got invited to conference in Amsterdam!

Andrew

  • Migration to FTS groups for CMS [Ongoing]
  • Capacity planning system [Ongoing]
  • Dealing with more corrupt files from Estonia [Ongoing]
  • Fixing glite-APEL [Ongoing]
  • Modify PhEDEx (FileDownloadVerify) to check checksums [To do]
  • Change names/IP addresses of CMS squids [To do]
  • January accounting [To do]
  • CMS data ops
    • MC rereco/redigi at FNAL, PIC, IN2P3

Catalin

  • Frontier servlet update on ATLAS server (v3.27)
  • Deploy check_job_submission Nagios test on all CEs
  • Project Management Training Course [done]
  • Group Strategy Refresh [done]

Derek

  • Write Change Control for whole-node scheduling [done]
  • Investigating an issue on the ALICE VOBOX affecting submission to lcgce09

Matt

  • Better CE/WMS service coverage. [New]
  • Review MyProxy Nagios plugin. [New]
  • Review VOBOX/CE incident. [New]
  • Disk Deployment meeting (2011 pledges). [Ongoing]
  • Write Change Control for migrating FTS Agents to Quattor host. [Done]
  • Test transferring ATLAS file with problem checksum. [Done]

Richard

  • 2 days on Project Management course [Done].
  • Applied errata to G/S testbed BDII machines [Done].
  • Trying out the new hypervisor (hv-10) to see how much performance has improved; an existing VM has been moved across to it [Ongoing].
  • Developing a set of Quattor templates for an ARGUS server. This has morphed into evaluating the set of templates provided by QWG [Ongoing].
  • Working on the "team status page" being developed as an action from team awayday [Ongoing]
  • Reviewing G/S process documentation [Ongoing]
  • CASTOR items:
    • Working with SDW to import the latest CASTOR Quattor structure into the "cert-in-a-box" cluster. [Ongoing]

VO Reports

ALICE

ATLAS

CMS

LHCb

OnCall/AoD Cover

OnCall Rota

  • Primary OnCall: Catalin
  • Grid OnCall:
  • AoD: