RAL Tier1 weekly operations Grid 20101004

From GridPP Wiki
Revision as of 14:37, 4 October 2010 by Matt hodges (Talk | contribs)

Operational Issues

Description Start End Affected VO(s) Severity Status

Downtimes

Description Hosts Type Start End Affected VO(s)

Blocking Issues

Description Requested Date Required By Date Priority Status

Developments/Plans

Highlights for Tier-1 Ops Meeting

Highlights for Tier-1 VO Liaison Meeting

Detailed Individual Reports

Alastair

  • Working on ATLAS software server, testing CVMFS
    • 825 test jobs have been run.
    • lcg0805 has been set up for production-style testing; the queue still needs to be added to the ATLAS system.
    • Production tasks submitted.
  • Writing script to graph transfer times for FTS transfers
  • Working on HammerCloud test of CASTOR 2.1.9
    • Analysis queue setup
    • Need to copy DBrelease into pre-prod and replicate
  • Deploying gdss5xx series to atlasStripInput
  • Wrote checksumming script for disk servers
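
The checksumming script itself is not reproduced in this report. As a rough illustration only, a per-file checksum on a disk server could be computed along the following lines. The choice of Adler-32, the function name adler32_of_file and the chunk size are all assumptions made for this sketch, not details taken from the actual script.

```python
import zlib


def adler32_of_file(path, chunk_size=1024 * 1024):
    """Return the Adler-32 checksum of a file as 8 lowercase hex digits.

    Hypothetical helper: the algorithm and output format are assumptions.
    """
    checksum = 1  # Adler-32 is defined to start from 1
    with open(path, "rb") as handle:
        while True:
            # Read in chunks so large files never have to fit in memory
            chunk = handle.read(chunk_size)
            if not chunk:
                break
            # Feed the running value back in to continue the checksum
            checksum = zlib.adler32(chunk, checksum)
    return format(checksum & 0xFFFFFFFF, "08x")
```

Reading in fixed-size chunks keeps memory use constant regardless of file size, which matters for the multi-gigabyte files typically held on disk servers.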

Andrew

  • Prepared VO Support Survey presentation [Done]
  • Updating scripts due to pbsjobs database moving home [Done]
  • Dealing with RAL-FNAL issue [Done]
  • Setting up Squids for CMS (re-installing, testing, writing documentation) [Done]
  • CMS data ops
    • Running data rereco at RAL, PIC, FNAL, ASGC; MC rereco at IN2P3, FNAL, RAL [Ongoing]
    • WMAgent training session [Done]

Catalin

  • work on glite-LB quattor profile [ongoing]
  • investigate (x)ROOT(d)
  • migrate remaining databases [ongoing]
  • kernel upgrades on SL5 nodes [done]
  • halt old SL4 LFC FEs [done]
  • prepare nodes in ATLAS building for power shutdown [done]

Derek

  • CREAM CE quattor profile [ongoing]
  • Investigating CREAM CE instability [ongoing]
  • Sync'd quattor templates to QWG

Matt

  • Further testing of Quattorised gLite3.2 FTS FEs. [Ongoing]
  • Quattorisation of MyProxy nodes (write up Change Control). [Ongoing]
  • Test FTS SRM/GridFTP ratio configuration.
  • Prep for Tier-1 Resources meeting. [New]
  • Nagios logfile monitoring development. [New]

Richard

  • Prepping for filesystem re-do on RAL top-level BDIIs
  • Adding extra ganglia monitoring to BDIIs
  • Working on the "team status page" being developed as an action from team awayday [ongoing]
  • Reviewing G/S process documentation [ongoing]
  • CASTOR items:
    • Ran 2.1.9 functional test suite on LHCb instance of CASTOR [Done]


VO Reports

ALICE

ATLAS

CMS

LHCb

OnCall/AoD Cover

OnCall Rota

  • Primary OnCall:
  • Grid OnCall: Catalin (Mon-Sun)