RAL Tier1 weekly operations castor 27/07/2009
From GridPP Wiki
Summary of Previous Week
- SRM development. Testing now underway at CERN (Shaun)
- Setting up BDII host for Certification for CIP testing (Cheney)
- c2probe work (Cheney)
- Investigating low data rates between BNL and RAL. The suspected cause is the maximum TCP window size (Brian)
- Cleaning ATLAS namespace (Brian)
- Added final 2 tape drives to CASTOR; all are online now (Tim)
- Deployed 2.1.8-8 on 4 more tape servers (Chris)
- Setting up SRM on certification to work with production for CMS (Shaun)
- Rewriting and improving puppet restarter (Shaun)
- CIP development on certification (Jens)
- CASTOR Disaster Recovery document (Matt)
- Increased lhcbDst total ROOTD slots from 100 to 400 (Matt)
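The TCP window suspicion in the BNL to RAL item above can be illustrated with a rough calculation: a single TCP stream cannot move more than one window of data per round trip, so its rate is bounded by window / RTT. The sketch below is only an illustration. The 100 ms round-trip time and the window sizes are assumed values, not measurements from the BNL to RAL link, and the function names are hypothetical.

```python
def max_throughput_mbit(window_bytes: int, rtt_seconds: float) -> float:
    """Upper bound on single-stream TCP throughput in Mbit/s (window / RTT)."""
    return window_bytes * 8 / rtt_seconds / 1e6


def required_window_bytes(target_mbit: float, rtt_seconds: float) -> int:
    """Bandwidth-delay product: bytes in flight needed to sustain target_mbit."""
    return int(round(target_mbit * 1e6 / 8 * rtt_seconds))


if __name__ == "__main__":
    rtt = 0.1  # assumed round-trip time of 100 ms, not a measured value
    # Assumed window sizes: 64 KiB, 1 MiB, 4 MiB
    for window in (64 * 1024, 1024 * 1024, 4 * 1024 * 1024):
        rate = max_throughput_mbit(window, rtt)
        print(f"window {window:>8} bytes -> at most {rate:8.2f} Mbit/s per stream")
    # Window needed to fill an assumed 1 Gbit/s path at the same RTT
    print("window for 1 Gbit/s:", required_window_bytes(1000, rtt), "bytes")
```

Under these assumptions a 64 KiB window caps a stream at a few Mbit/s regardless of link capacity, which is why a small maximum window is a plausible cause of low transatlantic rates; running more parallel streams or raising the maximum window are the usual ways around it.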
Developments for this week
- CIP development on certification (Jens)
- Investigate 2.1.8 NS client on 2.1.7 NS DB (Chris)
- Deploy 2.1.8-8 on remaining tape servers (Chris)
- Configuring disk servers to use gridftp-internal (Chris)
- ATLAS bulk deletion of files in CASTOR but not in LSF (Brian)
- Tier1 Production Manager (Matt)
- CASTOR Disaster Recovery document (Matt)
- Interviewing for CASTOR Pre Production Service Manager post (Matt)
- Establish with experiments when to intervene with hardware on D0T1 disk servers (Matt)
- Write CASTOR status for oversight committee (Matt)
- Find out more about CERN's virtualized certification setup (Chris, Matt)
- Python 3 training course (Cheney)
Ongoing
- Fix missing graphs on castormon: ATLAS tape migration and aggregated Gen monitoring (Brian)
- Cleaning up database for a future 2.1.8 upgrade (Shaun)
- Setting up Preproduction (Matt)
Operations Issues
none
Blocking issues
none
Scheduled and Cancelled Down Times
none
Changes to Production Milestones
none
Advanced Planning
- Work with Fabric to add extra RAID card in remaining Viglen'06 disk servers (July/August)
- SRM 2.8 upgrade (sometime during August)
- CIP upgrade to include nearline publishing (July/August)
- Improve resiliency to central services (This year)
- Upgrade nameserver to 2.1.8 (Possibly during September)
- Database optimization tasks (September)
- Black and white lists: in discussion with ATLAS to establish their requirements
Staffing
- Castor on Call person: Chris
- Shaun on A/L
- Matt on TOIL Mon morning, A/L Thurs