RAL Tier1 weekly operations Grid 20110124
Revision as of 15:24, 24 January 2011 by Matt hodges
Operational Issues
Description | Start | End | Affected VO(s) | Severity | Status |
---|---|---|---|---|---|
Downtimes
Description | Hosts | Type | Start | End | Affected VO(s) |
---|---|---|---|---|---|
Blocking Issues
Description | Requested Date | Required By Date | Priority | Status |
---|---|---|---|---|
Developments/Plans
Highlights for Tier-1 Ops Meeting
Highlights for Tier-1 VO Liaison Meeting
Detailed Individual Reports
Alastair
- ATLAS TaskForce [ongoing]
- Working on ATLAS permission change. [On hold]
- Checksumming 16k ATLAS Tape files.
- Help setting up CVMFS at RAL PP.
- Putting ATLAS squids into production and testing that failover works.
Andrew
- Sorting out APEL; problem due to corrupted SpecRecords table. APEL developers investigating. [Ongoing]
- Migration to FTS groups for CMS [Ongoing]
- Investigating corrupt files written into CASTOR over Christmas holidays [Done]
- Capacity planning system project [Ongoing]
- CMS data ops
  - Dec22 data rereco postmortem
  - Data, MC rereco
Catalin
- Group Strategy Refresh
- Project Management Training Course
- WMS03 disk replacement (with Fabric) [done]
- Frontier servlet update on ATLAS server [done]
- ATLAS squid nodes deployment [done]
Derek
- Revised change control for batch job OS selection mechanism [done]
- Errata updates to systems [done]
- Testing implementation of whole node scheduling [done]
- Write Change Control for whole node scheduling
- Nagios test for basic job submission from CEs [ongoing]
Matt
- Write Change Control for migrating FTS Agents to Quattor host. [New]
- Test transferring ATLAS file with problem checksum. [New]
- Disk Deployment meeting (2011 pledges). [Ongoing]
- Prep for Strategy Refresh. [Done]
- Test FTS SRM/GridFTP ratio configuration. [Stalled]
Richard
- Developing a set of Quattor templates for an ARGUS server. Now morphed into evaluating the set of templates provided by QWG [Ongoing]
- Working on the "team status page" being developed as an action from team awayday [Ongoing]
- Reviewing G/S process documentation [Ongoing]
- CASTOR items:
  - Working with SDW to import latest CASTOR Quattor structure into the "cert-in-a-box" cluster. [Ongoing]
VO Reports
ALICE
ATLAS
CMS
LHCb
OnCall/AoD Cover
- Primary OnCall:
- Grid OnCall: Derek
- AoD: