RAL Tier1 weekly operations Grid 20101025

From GridPP Wiki

Operational Issues

Description | Start | End | Affected VO(s) | Severity | Status
SW RAID problems on lcgwms03 (non-LHC) | Fri 22-Oct-2010 | | non-LHC | | Fabric aware of the problem

Downtimes

Description | Hosts | Type | Start | End | Affected VO(s)

Blocking Issues

Description | Requested Date | Required By Date | Priority | Status

Developments/Plans

Highlights for Tier-1 Ops Meeting

Highlights for Tier-1 VO Liaison Meeting

Detailed Individual Reports

Alastair

  • Working on ATLAS software server, testing CVMFS
    • 825 test jobs have been run.
    • lcg0805 has been set up for production-style testing; the queue still needs to be added to the ATLAS system.
    • Production tasks submitted.
  • Writing a script to graph transfer times for FTS transfers [On hold]
  • Investigating FTS gLite3.2 transfer problems. [Ongoing]
  • Finishing draining of gdss81 for removal from production.
  • Working on HammerCloud test of CASTOR 2.1.9 [Ongoing]

Andrew

  • Capacity planning system project [Ongoing]
  • CMS data ops
    • WMAgent testing - ran backfill at all Tier-1s except FNAL [Done]

Catalin

  • Deploy lcglb03 (gLite3.2 LB) in production [Ongoing]
  • Work on gLite-LB Quattor profile [Done]
  • Investigate (x)ROOT(d) [Ongoing]

Derek

  • Integration of Quattor changes made for CREAM CE into production
  • Debugging lcgce03 instability

Matt

  • Testing PBS monitoring tools (pbswebmon, JobMon) [New]
  • Further testing of Quattorised gLite3.2 FTS FEs. [Ongoing]
  • Quattorisation of MyProxy nodes (Change Control approved). [Ongoing]
  • Test FTS SRM/GridFTP ratio configuration.
  • Disk Deployment meeting. [Done]
  • Quattor FTM profiles. [Done]

Richard

  • Scheduling update to RAL site-level BDIIs [Ongoing]
  • Developing a set of Quattor templates for an ARGUS server [Ongoing]
  • Developing a "pseudo-update" to apply a gLite update to BDIIs
  • Wrote a CGI script for logging hardware requests from G/S team in the Fabric queue in RT [Ongoing]
  • Working on the "team status page" being developed as an action from team awayday [Ongoing]
  • Reviewing G/S process documentation [Ongoing]
  • CASTOR items:


VO Reports

ALICE

ATLAS

CMS

LHCb

OnCall/AoD Cover

OnCall Rota

  • Primary OnCall:
  • Grid OnCall: Catalin (Mon-Thu)