RAL Tier1 weekly operations castor 03/2/2017

Draft agenda

1. Problems encountered this week

2. Upgrades/improvements made this week

3. What are we planning to do next week?

4. Long-term project updates (if not already covered)

 1. Castor 2.1.15
 2. SL7 upgrade on tape servers
 3. SRM upgrade to SL6

5. Special topics

6. Actions

7. Anything for CASTOR-Fabric?

8. AoTechnicalB

9. Availability for next week

10. On-Call

11. AoOtherB

Operations problems

LHCb transfer failures due to an excessive number of open DB cursors (cursor leak)

CMS SAM tests have been failing since the CASTOR upgrade to 2.1.15 (RT185411)

gdss776 down

Operations news

CMS was upgraded to CASTOR 2.1.15

Plans for next week

CASTOR 2.1.15 upgrade on facilities on Tuesday

Long-term projects

CIP migration to Aquilon and upgrade to SL6

SRM upgrade to SL6

Tape-server migration to Aquilon and SL7 upgrade

Actions

Drain 10% of the 13-generation disk servers (lhcbDst) for decommissioning

Add GP to the mail recipients of the CASTOR overwatch script

GP to plan a meeting with Gareth about patching the Tier-1 CASTOR name servers

Search the logs from the SAM tests and hack the tests into working

Staffing

GP on call next week