Summary of Previous Week
Developments
- Alastair
- Started working on ATLAS code to test different permissions at the Tier 1.
- Updated User Board CPU allocations.
- Continued work producing list of experiment requirements.
- Helped Brian with the disk server problems on gdss403.
- Andrew
- FTS channels: changed STAR-UKILT2ICHEP from SRMCOPY to URLCOPY; drained channels to/from RAL tier-2
- Deployed csflnx414 as a CMS VOBOX for testing
- Installed ProdAgent on csflnx414; learning about ProdAgent; submitted a test workflow to RAL
- Continued development of automated generation of UB schedule spreadsheet
- Testing of CMS skimming jobs
- Attended OPB "Managing data with robots"
- Catalin
- re-installed WMS03 and made it hotswappable
- completed FronTier for ATLAS installation/configuration
- progress with SL5 VOBOX for Alice
- various gLite and kernel upgrades
- Derek
- Incorporated comments on Quattor status talk
- Attending Quattor workshop
- Set lcgce01 to Production Status
- Matt
- Backed out 64-bit torque/maui on the scheduler
- Metrics feedback to Mayo
- Tested footprints helpdesk for tracking Grid Service issues; provided feedback to Gareth
- Richard
- Updated Job Plan
- CASTOR activities: Built a complete set of Quattor templates for the 4 machines in the new pre-prod instance (and exposed a couple of bugs in Quattor in the process!)
- Mayo
- Collected some feedback on the new Metric Gathering System prototype
- Created report view for new Metric Gathering System
- Began writing a script to extract data from the Nagios-alarm-response-Grid spreadsheet for import into svn
- Worked on a script to automate collection of tape robot statistics into a spreadsheet
Operational Issues and Incidents
Description | Start | End | Affected VO(s) | Severity | Status
PBS service failed to restart | 2009-11-05 13:15 | 2009-11-05 13:50 | All | Minor | Resolved by rolling back to 32-bit torque/maui
Plans for Week(s) Ahead
Plans
- Alastair
- Continue with any remaining training/tutorials
- Test prod/poweruser0/user permissions at the Tier 1.
- Produce first draft of experiment requirements by Wednesday.
- Work on more efficient code for testing checksums on a disk server.
- Andrew
- Deploy gdss383 to cmsFarmRead
- Continue work on automated generation of UB Schedule spreadsheet
- Add MySQL client, Spreadsheet::WriteExcel, Spreadsheet::ParseExcel to lcgui02
- Continue learning about ProdAgent
- Continue investigating ReadAhead and LazyDownload on CMS skimming jobs
- R89 machine room training
- Catalin
- finish deployment of SL5 VOBOX for Alice
- deploy 2nd ALICE VOBOX (see Fabric helpdesk request)
- kernel upgrades
- Derek
- Matt
- Disaster recovery planning
- Richard
- CASTOR activities: Add support for CASTOR config files to the Quattor templates
- Apply the recent Quattor experience to completing the Quattor config/build for BDII servers
- Mayo
- Collect more feedback on prototype system
- Work on additional functionality for future releases of the Metric System
- Continue work on script for extracting data from Nagios-alarm-response-Grid spreadsheet for importing into svn
- Continue work on script for automating tape robot spreadsheet
Resource Requests
Downtimes
Description | Hosts | Type | Start | End | Affected VO(s)
kernel upgrades | all CEs | at risk | Wed 11 Nov 09:30 | Wed 11 Nov 12:00 | all
Requirements and Blocking Issues
Description | Required By | Priority | Status
Hardware for 2nd ALICE SL5 64bit VOBOX | 16 Nov 2009 | High | Request to re-deploy lcg0614 (ALICE SW WN) as SL5 VOBOX (using quattor or not) - RT#53338
Hardware for LHCb SL5 64bit VOBOX | 25 Nov 2009 | Medium | Request for HW allocation (RT#53392)
Hardware for testing LFC/FTS resilience | | High | DataServices want to deploy a DataGuard configuration to test LFC/FTS resilience; request for HW made through RT Fabric queue
Hardware for PPS | | High | We have made a commitment to test PPS pre-releases, and have no hardware dedicated for this.
OnCall/AoD Cover
- Primary OnCall:
- Grid OnCall: Derek (Mon-Sun)
- AoD: