Difference between revisions of "RAL Tier1 weekly operations castor 27/1/2017"

From GridPP Wiki
Latest revision as of 09:14, 3 February 2017

1. Problems encountered this week

2. Upgrades/improvements made this week

3. What are we planning to do next week?

4. Long-term project updates (if not already covered)

 1. Castor 2.1.15
 2. SL7 upgrade on tape servers
 3. SRM upgrade to SL6

5. Special topics

6. Actions

7. Anything for CASTOR-Fabric?

8. AoTechnicalB

9. Availability for next week

10. On-Call

11. AoOtherB

Operation problems

gdss780 was not reachable after the LHCb 2.1.15 upgrade and had memory swapped (RT185203: https://helpdesk.gridpp.rl.ac.uk/Ticket/Display.html?id=185203)

DB problems following the ATLAS CASTOR upgrade

The ALICE xrootd daemon did not start, possibly due to the upgrade to xrootd 4 (RT185222: https://helpdesk.gridpp.rl.ac.uk/Ticket/Display.html?id=185222)

gdss783, gdss784, gdss786 are in downtime for ALICE

Operation news

ATLAS and Gen were upgraded to 2.1.5-20

The number of jobs running in the farm was reduced as a precaution against SRM issues after the 2.1.5-20 upgrade on ATLAS

Plans for next week

2.1.15 upgrade on CMS on Tuesday

Long-term projects

Tape-server migration to aquilon and SL7 upgrade

SRM 2.1.16/SL6 upgrade

Actions

Drain 10% of the 13 generation of disk servers (lhcbDst) for decommissioning

Add GP to the mail recipients of the CASTOR overwatch script

Staffing

RA on call next week