RAL Tier1 weekly operations castor 12/5/2017
Contents
Draft agenda
1. Problems encountered this week
2. Upgrades/improvements made this week
3. What are we planning to do next week?
4. Long-term project updates (if not already covered)
   1. SL7 upgrade on tape servers
   2. SRM upgrade to SL6/CASTOR 2.1.16
   3. SL5 elimination from CASTOR functional test boxes and tape verification server
   4. CASTOR stress test improvement
5. Special topics
6. Actions
7. Anything for CASTOR-Fabric?
8. AoTechnicalB
9. Availability for next week
10. On-Call
11. AoOtherB
Operation problems
The CASTOR upgrade of the LHCb stager was not carried out on Tuesday 9/5 as planned due to an installation problem with aquilon
The nsd daemon on the cmsdlf node did not start after the upgrade
The nsd daemon on xroot-cms-manager was not working
The printndiskcopy tool (which replaced diskserver_qry in CASTOR 2.1.16) outputs only the top 1000 files from a disk server (we will get and install the latest version from CERN)
Operation news
Tier NS was upgraded on Tuesday
The LHCb stager was upgraded to CASTOR 2.1.16-13 and the SRMs were upgraded to CASTOR 2.1.16-10 on Thursday
Plans for next week
Trip to CERN for the CASTOR/Ceph F2F meeting
Long-term projects
CIP migration to aquilon and upgrade to SL6
SL6 upgrade on functional test boxes and tape verification server
Tape-server migration to aquilon and SL7 upgrade (on hold at the moment)
CASTOR stress test improvement
Actions
DB hardware upgrade tracking
Drain and decommission/recommission the 12 generation disk servers
RA to get a new source control management system sorted for CASTOR script development
GP to prepare a report on the performance of the WAN parameters deployed on CMS disk servers
Staffing
RA until Monday and GP from Tue onwards