Summary of Previous Week
Developments
- Alastair
- Looked into results of Hammer Cloud test to understand Frontier Performance.
- Made progress with getting ATLAS powerusers to run at the Tier 1.
- Updated RAL PP twiki with feedback from ATLAS meeting.
- Andrew
- Added checksum checking of migrated files to PhEDEx production instance
- Wrote Nagios plugin for checking user proxy on CMS VOBOX
- Added (hidden) option to capacity & efficiency ganglia pages for specifying units (KSI2K or HEP-SPEC06)
- Added options to all UB schedule scripts for HEP-SPEC06 option
- Wrote documentation about adding new VO to UB schedule scripts
- Preparations for CMS Data Ops training
- Training: online display screen equipment course & self-assessment
- Catalin
- worked on SL5 LHCb VOBOX quattorised deployment
- closed the t2k.org issue (user error)
- WMS03 (non-LHC) update
- Derek
- Test CREAM CE reinstallation instructions
- Created and tested quattor template to implement BLParser service
- Added updated voms certificates to yaim config rpm
- Listened in on GDB
- Matt
- Tested R-GMA recovery (Flexible Archiver component)
- Worked with Carmine on LFC recovery plans
- Produced 2009/Q4 FTS metrics for quarterly report
- Richard
- 2 days A/L
- Finished plan for BDII changes
- Continued writing discussion document for DNS proposal
- Continued work on the CASTOR pre-prod instance
- Built a test machine as a BDII server to test quattor templates
- Worked with JK and GS on a script to check CASTOR checksums
- Mayo
- Encrypted passwords within the Metric system
- Added a change password feature to the metric system
- Fixed a bug within the Metric system
- Worked on tape statistics spreadsheet project: converting Excel charts to HTML
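Two of the items above (checksum checking of migrated files in PhEDEx, and the script to check CASTOR checksums) come down to comparing a freshly computed file checksum with a recorded one. The following is only a minimal sketch of that kind of check, assuming Adler-32 checksums recorded as hex strings. The function names `adler32_of_file` and `checksum_matches` are hypothetical and are not taken from the actual PhEDEx or CASTOR scripts.

```python
import zlib


def adler32_of_file(path, chunk_size=1024 * 1024):
    """Return the Adler-32 checksum of a file as an 8-digit lowercase hex string."""
    value = 1  # Adler-32 is defined to start from 1
    with open(path, "rb") as handle:
        while True:
            chunk = handle.read(chunk_size)
            if not chunk:
                break
            # Feed the running value back in so large files are read in chunks
            value = zlib.adler32(chunk, value)
    return format(value & 0xFFFFFFFF, "08x")


def checksum_matches(path, expected_hex):
    """Compare the computed checksum with a recorded one.

    The comparison is numeric, so case and missing leading zeros in the
    recorded value (an assumption about how catalogues may store it) do
    not cause false mismatches.
    """
    return int(adler32_of_file(path), 16) == int(expected_hex, 16)
```

Reading in chunks keeps memory use flat for large files, and comparing the values numerically avoids spurious mismatches caused by formatting differences between catalogues.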
Operational Issues and Incidents
| Description | Start | End | Affected VO(s) | Severity | Status |
| FTS DB performance problems | 20100115 11:00 | 20100115 16:00 | LHC | High | Load on Orisa nodes redistributed across nodes by reconfiguring FTS agents. |
Plans for Week(s) Ahead
Plans
- Alastair
- Run (hopefully) final tests on Frontier server (after Catalin has performed servlet update) to confirm it is working well.
- Continue updating RAL PP twiki.
- Complete version 1 of Tier 1 VO requirements with information that has been provided by Raja.
- Possibly away/working from home on Tuesday (depends on how long a hospital appointment takes)
- Andrew
- Joining CMS Data Ops - away at CERN for training
- Catalin
- finalise SL5 LHCb VOBOX deployment (hotswapping issues)
- follow up some post-reboot WMS issues with CERN
- work on tidying up LFC schemas (with Carmine)
- exercise Alice xrootd (manager + peer) re-installation (on old SL4 voboxes)
- Derek
- Implementing BLParser on lcgbatch01
- Completing testing of CE and CREAM CE for Intervention changes
- GLexec and SCAS on SL5
- Matt
- Finish Grid Services Disaster Recovery document
- Planning ATLAS/R89 co-hosting of Grid Services
- Provide test site BDII for CIP upgrade testing
- Richard
- Finish discussion document for DNS proposal
- Continue working on CASTOR pre-prod instance
- Further work on the Quattor templates for BDII server
- Re-do existing STP time bookings and enter EGEE timesheets back to starting date
- 2 days A/L
- Mayo
- Automating Metric report system
- Adding charts to the metric system
- Web interface and script to fetch data for Tape robot statistics spreadsheet project
Resource Requests
Downtimes
| Description | Hosts | Type | Start | End | Affected VO(s) |
| FTS DB problems | Orisa, FTS agents | Unscheduled | 20100115 11:00 | 20100115 16:00 | LHC |
Requirements and Blocking Issues
| Description | Required By | Priority | Status |
| Hardware for testing LFC/FTS resilience | | High | DataServices want to deploy a DataGuard configuration to test LFC/FTS resilience; request for HW made through RT Fabric queue |
| Hardware for Testbed | | High | Required for change validation, load testing, etc. Also for phased rollout (which replaces PPS). |
| Hardware for SCAS servers | Feb 1 2010 | High | Hardware required for production SCAS servers - required to be in place by end of Feb |
| Hardware for SL5 CREAM CE for non-LHC SL5 batch access | | Medium | Hardware required for CREAM CE for non-LHC VOs |
| Pool accounts for Super B VO | | Medium | Required to enable Super B VO on batch farm |
OnCall/AoD Cover
- Primary OnCall: Catalin (Mon-Sun)
- Grid OnCall:
- AoD: Catalin (Wed)