RAL Tier1 weekly operations castor 27/1/2017

From GridPP Wiki

1. Problems encountered this week

2. Upgrades/improvements made this week

3. What are we planning to do next week?

4. Long-term project updates (if not already covered)

 1. Castor 2.1.15
 2. SL7 upgrade on tape servers
 3. SRM upgrade to SL6

5. Special topics

6. Actions

7. Anything for CASTOR-Fabric?

8. AoTechnicalB

9. Availability for next week

10. On-Call

11. AoOtherB

Operation problems

gdss780 was not reachable after the LHCb 2.1.15 upgrade and had memory swapped (RT185203)

DB problems following the ATLAS CASTOR upgrade

The ALICE xrootd daemon did not start, possibly due to the upgrade to xrootd 4 (RT185222)

gdss783, gdss784 and gdss786 are in downtime for ALICE

Operation news

ATLAS and Gen were upgraded to 2.1.15-20

The number of jobs running on the farm was reduced as a precaution against SRM issues after the 2.1.15-20 upgrade on ATLAS

Plans for next week

2.1.15 upgrade on CMS on Tuesday

Long-term projects

Tape-server migration to Aquilon and SL7 upgrade

SRM 2.1.16/SL6 upgrade

Actions

Drain 10% of the 13-generation disk servers (lhcbDst) for decommissioning

Add GP to the mail recipients of the CASTOR overwatch script

Staffing

RA on call next week