RAL Tier1 weekly operations castor 03/2/2017

From GridPP Wiki

Latest revision as of 14:58, 9 February 2017

Draft agenda

1. Problems encountered this week

2. Upgrades/improvements made this week

3. What are we planning to do next week?

4. Long-term project updates (if not already covered)

 1. CASTOR 2.1.15
 2. SL7 upgrade on tape servers
 3. SRM upgrade to SL6

5. Special topics

6. Actions

7. Anything for CASTOR-Fabric?

8. AoTechnicalB

9. Availability for next week

10. On-Call

11. AoOtherB

Operations problems

LHCb transfer failures due to an excessive number of open DB cursors (cursor leaks)

CMS SAM tests have been failing since the CASTOR upgrade to 2.1.15 (RT185411, https://helpdesk.gridpp.rl.ac.uk/Ticket/Display.html?id=185411)

gdss776 down

Operations news

CMS was upgraded to CASTOR 2.1.15

Plans for next week

CASTOR 2.1.15 upgrade on Facilities on Tuesday

Long-term projects

CIP migration to Aquilon and upgrade to SL6

SRM upgrade to SL6

Tape-server migration to Aquilon and SL7 upgrade

Actions

Drain 10% of the 13-generation disk servers (lhcbDst) for decommissioning

Add GP to the mail recipients of the CASTOR overwatch script

GP to plan a meeting with Gareth about patching the Tier-1 CASTOR name servers

Search the logs from the SAM tests and work out a fix to get them passing

Staffing

GP on call next week