Difference between revisions of "RAL Tier1 weekly operations castor 10/2/2017"

Latest revision as of 12:18, 13 February 2017

Draft agenda

1. Problems encountered this week

2. Upgrades/improvements made this week

3. What are we planning to do next week?

4. Long-term project updates (if not already covered)

 1. SL7 upgrade on tape servers
 2. SRM upgrade to SL6/CASTOR 2.1.16

5. Special topics

6. Actions

7. Anything for CASTOR-Fabric?

8. AoTechnicalB

9. Availability for next week

10. On-Call

11. AoOtherB

Operations problems

"ORA-01000: maximum open cursors exceeded" in ATLAS, with a callout caused by failing functional tests (RT185681)

Failure of SRM cms-disk SAM tests

Problems after CASTOR 2.1.15 upgrade on facilities

Operations news

CVE-2016-7117 patch applied on CASTOR NS servers, RT185616 (https://helpdesk.gridpp.rl.ac.uk/Ticket/Display.html?id=185616)

One of the CMS SAM test failures related to xroot is fixed, elog-5331 (https://elog.gridpp.rl.ac.uk/Tier1/5331)

Plans for next week

Press on with SRM upgrade to SL6

Continue rewriting CIP

Long-term projects

CIP migration to aquilon and upgrade to SL6

SRM upgrade to SL6/CASTOR 2.1.16

Tape-server migration to aquilon and SL7 upgrade

Actions

Drain 10% of the 13 generation of disk servers (lhcbDst) for decommissioning

GP and AS to communicate the open DB cursors problem to CERN

Add GP to the mail of CASTOR overwatch script

Search the logs from SAM tests and hack into working

Staffing

All in

GP on call next week