RAL Tier1 weekly Operations Grid 20121210

From GridPP Wiki

Operational Issues

Description Start End Affected VO(s) Severity Status

Downtimes

Description Hosts Type Start End Affected VO(s)

Blocking Issues

Description Requested Date Required By Date Priority Status

Developments/Plans

Highlights for Tier-1 Ops Meeting

Highlights for Tier-1 VO Liaison Meeting

Detailed Individual Reports

Andrew

  • Last week:
    • Searched for, found, modified & built Maui rpms
    • Fixed issue with NA62 CPU accounting
    • Learned about shared-storage cluster, tested MyProxy
    • Re-installed lcgsquid0677 (lots of problems)
    • Various kernel/errata upgrades
    • More batch system testing
    • CMS processing
  • Coming week:
    • FTS3 Nagios tests
    • Deploy new Maui
    • Upgrade PhEDEx
    • Write APEL upgrade to UMD2 change control
    • CMS processing

Catalin

  • Last week:
  • This week:

Ian

  • Last week:
  • Coming week:
    • ATLAS Jamboree at CERN
    • GDB

James

  • Last week:
  • This week:

Orlin

  • Quattorise, Install & Test EMI2/SL6 WNs on the gridTest queue [ongoing]
  • Bring the Testbed back in order, check the list of services [ongoing]
  • Upgrade WNs to SL5 EMI2 [done]
  • Learn more about Frontier services and install test lcgvo01/02 frontiers following Alastair's wikipage [ongoing]
  • Learn more about virtualization clustering from Martin and implement a failover cluster for Argus [ongoing]
  • Prepare & Submit change-control for EMI2/SL6 Argus Server [done]
  • Assign some production WNs to authenticate with EMI2/SL6 Argus Server [to do]
  • Test the possibility of running an EMI2/SL6 WN as a preinstalled cloud image with a batch client [to do]
  • Test and compare jobs running on cloud/hypervisor against jobs on physical hardware [to do]
  • Test & implement extra monitoring tools for CREAM CEs (if necessary) [to do]
  • Grid certificates and elastic FTS [to think about]
  • Test High Availability & failover for Argus server with Corosync/RGManager/CMAN [to think about - low priority]

VO Reports

ALICE

ATLAS

CMS

LHCb

OnCall/AoD Cover

OnCall Rota

  • Primary OnCall: Ian (Mon-Sun)
  • Grid OnCall: Andrew (Mon-Sun)

Absences

Ian: CERN Mon-Thu

Catalin: CERN Course Mon-Wed