RAL Tier1 weekly operations castor 26/5/2017
Draft agenda
1. Problems encountered this week
2. Upgrades/improvements made this week
3. What are we planning to do next week?
4. Long-term project updates (if not already covered)
    1. SL7 upgrade on tape servers
    2. SRM upgrade to SL6/CASTOR 2.1.16
    3. SL5 elimination from CASTOR functional test boxes and tape verification server
    4. CASTOR stress test improvement
5. Special topics
    1. Future CASTOR upgrade methodology
6. Actions
7. Anything for CASTOR-Fabric?
8. AoTechnicalB
9. Availability for next week
10. On-Call
11. AoOtherB
Operation problems
gdss773 and gdss804 showed hardware problems and were removed from production
The LHCb load test showed the CV 11 disk servers of lhcbUser to be the bottleneck
Operation news
ATLAS Stager was upgraded to 2.1.16 on Tuesday
CMS Stager/SRMs were upgraded to 2.1.16 on Thursday
Plans for next week
Upgrade Gen to 2.1.16 on Wednesday
Long-term projects
CIP migration to aquilon and upgrade to SL6
SL6 upgrade on functional test boxes and tape verification server
Tape-server migration to aquilon and SL7 upgrade (on hold at the moment)
CASTOR stress test improvement
Actions
GP to check the rate of TURL requests from LHCb
DB hardware upgrade tracking
Drain and decommission/recommission the 12 generation disk servers
RA to get a new source control management system sorted for CASTOR script development
GP to prepare a report on the performance of the WAN parameters deployed on CMS disk servers
Staffing
RA on call until Monday, then GP