RAL Tier1 weekly operations Grid 20101122
Revision as of 13:37, 23 November 2010 by Alastair dewhurst
Operational Issues
Description | Start | End | Affected VO(s) | Severity | Status |
---|---|---|---|---|---|
Downtimes
Description | Hosts | Type | Start | End | Affected VO(s) |
---|---|---|---|---|---|
Blocking Issues
Description | Requested Date | Required By Date | Priority | Status |
---|---|---|---|---|
Developments/Plans
Highlights for Tier-1 Ops Meeting
Highlights for Tier-1 VO Liaison Meeting
Detailed Individual Reports
Alastair
- ATLAS TaskForce
- Fixing bugs with ATLAS re-processing to make sure it runs smoothly at RAL.
- Working on returning gdss391 to production.
- Working on ATLAS permission change. [On hold]
- Preparing for CERN trip next week. (Producing talks)
Andrew
- Capacity planning system project [Ongoing]
- CMS CASTOR testing [Ongoing]
- CMS data ops
- Pile-up MC reprocessing at FNAL [Done]
- Accounting for Nov4 rereco
- Skims at FNAL [Ongoing]
- WMAgent testing [Ongoing]
Catalin
- LB service migration to gLite3.2 [done]
- work on (x)ROOT(d); deploy test infrastructure [ongoing]
- test squid on LHCb VOBOX [done]
- update glite-WMS
- work on Tier1 DB migration plans
- work on WMS monitoring [ongoing]
Derek
- Investigation of secure deployment of ssh keys to hosts [ongoing]
- Reinstalling lcgce08 [ongoing]
- Investigating solutions for whole node scheduling [ongoing]
- Attending NGS Innovation Forum (Tue-Wed)
Matt
- Switch to gLite 3.2 FTS frontends (November 24). [New]
- Reprofile disk capacity. [New]
- Deploy top BDII on EC2. [Ongoing]
- Writing storage testbed proposal. [Ongoing]
- Quattorisation of FTM. [Ongoing]
- Deploying PBS JobMon monitoring tools. [Stalled]
- Test FTS SRM/GridFTP ratio configuration. [Stalled]
- Quattorisation of MyProxy nodes. [Done]
- Further testing of Quattorised gLite3.2 FTS FEs. [Done]
Richard
- 1.5 days A/L
- Working on the tool for automatically checking middleware baselines [Ongoing]
- Developing a set of Quattor templates for an ARGUS server [Ongoing]
- Developing a "pseudo-update" to apply gLite update 19 to BDIIs [Ongoing]
- Updated the CGI script for logging hardware requests from G/S team in the Fabric queue in RT [Ongoing]
- Working on the "team status page" being developed as an action from team awayday [Ongoing]
- Reviewing G/S process documentation [Ongoing]
- CASTOR items:
- Applied RPM errata and kernel versions to the 4 CIP servers.
VO Reports
ALICE
ATLAS
CMS
- Daily metric has been ERROR since CASTOR upgrade. Site readiness is now NOT READY.
LHCb
OnCall/AoD Cover
- Primary OnCall:
- Grid OnCall: Derek (Mon-Sun)
- AoD: Catalin (Wed)