RAL Tier1 weekly operations Grid 20110131
From GridPP Wiki
Operational Issues
Description | Start | End | Affected VO(s) | Severity | Status
Downtimes
Description | Hosts | Type | Start | End | Affected VO(s)
Blocking Issues
Description | Requested Date | Required By Date | Priority | Status
Developments/Plans
Highlights for Tier-1 Ops Meeting
Highlights for Tier-1 VO Liaison Meeting
Detailed Individual Reports
Alastair
- Working on ATLAS permission change. [On hold]
- Work on CVMFS local/setup.sh.
- Working on 0000 Checksum issue
- Consistency checking ATLAS space tokens for MC -> Data disk merge.
- Got invited to conference in Amsterdam!
Andrew
- Migration to FTS groups for CMS [Ongoing]
- Capacity planning system [Ongoing]
- Dealing with more corrupt files from Estonia [Ongoing]
- Fixing glite-APEL [Ongoing]
- Modify PhEDEx (FileDownloadVerify) to check checksums [To do]
- Change names/IP addresses of CMS squids [To do]
- January accounting [To do]
- CMS data ops
  - MC rereco/redigi at FNAL, PIC, IN2P3
Catalin
- Frontier servlet update on ATLAS server (v3.27)
- Deploy check_job_submission Nagios test on all CEs
- Project Management Training Course [done]
- Group Strategy Refresh [done]
Derek
- Write Change control for whole node scheduling [done]
- Investigating issue on ALICE VO box affecting submission to lcgce09
Matt
- Better CE/WMS service coverage. [New]
- Review MyProxy Nagios plugin. [New]
- Review VOBOX/CE incident. [New]
- Disk Deployment meeting (2011 pledges). [Ongoing]
- Write Change Control for migrating FTS Agents to Quattor host. [Done]
- Test transferring ATLAS file with problem checksum. [Done]
Richard
- 2 days on Project Management course [Done].
- Applied errata to G/S testbed BDII machines [Done].
- Trying out the new hypervisor (hv-10) to see how much performance has improved; an existing VM has been moved across to it [Ongoing].
- Developing a set of Quattor templates for an ARGUS server. Now morphed into evaluating the set of templates provided by QWG [Ongoing]
- Working on the "team status page" being developed as an action from team awayday [Ongoing]
- Reviewing G/S process documentation [Ongoing]
- CASTOR items:
  - Working with SDW to import latest CASTOR Quattor structure into the "cert-in-a-box" cluster. [Ongoing]
VO Reports
ALICE
ATLAS
CMS
LHCb
OnCall/AoD Cover
OnCall Rota
- Primary OnCall: Catalin
- Grid OnCall:
- AoD: