From GridPP Wiki
Summary of Previous Week
Developments
- Alastair
- Finished security audit
- Deployed disk servers from non-prod to prod
- Went through most of the CASTOR training when Shaun wasn't too busy
- Learnt about Tier 2 data storage allocation from Brian
- Learnt how to make changes with Quattor and updated twiki
- Updated PPD twiki
- Andrew
- Completed consistency check of Aug 09 APEL & PBS; resolved problems with Oct 09 pbsjobs MySQL
- Writing Perl script to generate UB Schedule spreadsheet
- Attended CMS Offline and Computing Workshop, CERN
- Obtained CMS production role
- Meeting with a member of CMS data ops about ProdAgent
- Deleted 190,000 CMS files in /store/unmerged
- Training: CERN level 1 & 2 safety
- Catalin
- Finished the ALICE disk server deployment
- Deployed and tested the FronTier/squid server for ATLAS
- Installed the SL5 VOBOX for ALICE
- Started the drain operation for WMS03
- Derek
- Deployed updated VO config in Quattor
- Fixed Quattor directory creation on WNs
- Writing RAL talk for Quattor workshop
- Documenting CE information system setup
- Matt
- Deployed gLite 3.2 SL5 VOBOX
- Checked priorities for deploying Viglen 08 kit after it passes acceptance tests (to meet shortfalls in ATLAS and LHCb pledges)
- Richard
- DSE Training
- Deployed 5 disk servers into AtlasSimStrip
- Packaged RT helpdesk scripts plus their associated cron entries as an RPM using DR's layout
- Repackaged the gmetric-bdii-top.pl and tier1-bdii-top-config RPMs using DR's layout
- Updated the log analysis Perl scripts in the gmetric-bdii-top.pl and tier1-bdii-top-config RPMs for better performance: one shows a ~15x improvement, the other ~10x
- CASTOR activities: continued development of Quattor templates for servers in the pre-prod instance; also made DNS changes
- Mayo
- Rolled out first prototype of the new Metric Gathering System
- Collected some feedback on the new Metric Gathering System prototype
- Resolved SVN access issues
Operational Issues and Incidents
Description | Start | End | Affected VO(s) | Severity | Status
Plans for Week(s) Ahead
Plans
- Alastair
- Go through gLite training
- Finish CASTOR training
- Update CPU efficiencies
- Test UK Frontier/Squid using Athena release 15.5.1
- Test prod/poweruser0/user permissions at the Tier 1
- Continue updating PPD twiki on ATLAS software
- Andrew
- Continue work on automated generation of UB Schedule spreadsheet
- Deploy a spare service node as a VOBOX using Quattor; install & set up ProdAgent; run a test production job
- Catalin
- Finish deployment of SL5 VOBOX for ALICE
- Re-install WMS03 (hot-swapping)
- Integrate FronTier within ATLAS Frontier/squid network
- Derek
- Attend Quattor workshop (Brussels)
- Investigate/deploy SCAS
- Matt
- Disaster recovery planning
- Richard
- Update Job Plan
- Complete Quattor config/build for BDII servers
- CASTOR activities: Continue work on new pre-prod instance
- Mayo
- Collect more feedback on prototype system
- Begin working on additional functionality for future releases of the Metric System
- Work on phase two of the on-call documentation project
- Design specification for IPMI project
Resource Requests
Downtimes
Description | Hosts | Type | Start | End | Affected VO(s)
WMS03 hotswappable | lcgwms03.gridpp.rl.ac.uk | Scheduled Outage | Oct 30 (09:00) | Nov 05 (16:00) | non-LHC
Requirements and Blocking Issues
Description | Required By | Priority | Status
HW for Squid deployment | ATLAS | High | request made via RT Fabric queue; used reserved hardware
HW for FronTier deployment | ATLAS | High | request made via RT Fabric queue; used reserved hardware
HW for SL5 64-bit VOBOX | Alice | High | request made via RT Fabric queue; used reserved hardware
Hardware for testing LFC/FTS resilience | | High | DataServices want to deploy a DataGuard configuration to test LFC/FTS resilience; request for HW made through RT Fabric queue
Non-capacity HW for testing | | Medium | Still using the old HW
Hardware for PPS | | Medium | We have made a commitment to test PPS pre-releases, and have no hardware dedicated for this.
OnCall/AoD Cover
- Primary OnCall: Catalin (Mon-Thu)
- Grid OnCall:
- AoD: