Difference between revisions of "RAL Tier1 weekly operations castor 09/11/2018"

From GridPP Wiki
(Created page with "== Standing agenda == 1. Problems encountered this week 2. Upgrades/improvements made this week 3. What are we planning to do next week? 4. Long-term project updates (if n...")
 
 
Line 25: Line 25:
  
 
== Operation problems ==
 
<pre>
+
 
  * NRPE RPMs not installed on ERIS nodes. Currently with John, ticket: https://helpdesk.gridpp.rl.ac.uk/Ticket/Display.html?id=211056
+
Massive recall in Facilities caused congestion; resolved by waiting
  * castor-functional-test1 is still running tests that it shouldn't, and it triggered a callout on Thursday night. Genius John identified the problem (it is still running tests against decommissioned disk pools); RA to implement a permanent fix.
+
</pre>
+
  
 
== Operation news ==
 
 +
 
 +
  * na62 has moved to WLCGTape
  
  * wlcgTape is in prod.
+
  * Repack upgraded to SL7/2.1.17-35
+
  * Major cleanup of CMS files from cmsDisk (at the request of the VO)
+
 
+
  * GSI authentication for xrootd in production (mon 29/10)
+
 
+
  * na62 recovery operation ongoing using wlcgTape.
+
  
 
== Plans for next few weeks ==
 
Line 45: Line 39:
  
 
   * Move all needed disk servers from ATLAS d0t1 to wlcgTape (gdss893, gdss894, gdss895)
 
 +
 +
  * Move the rest of the Gen VOs to WLCGTape
  
 
   * Proceed with the cmsDisk decommissioning
 
 +
 +
  * Decommission xrootd-cms-manager
  
 
== Long-term projects ==
 
Line 54: Line 52:
 
== Actions ==
 
  
  * Ask TimA about whether remaining dark data can be deleted and whether ATLAS still needs the cinstancedlf alias
 
  
 
== Staffing ==
 
  
   * GP out until Friday.
+
   * RA out until a week Monday.
  
   * RA on call until Friday (then off to the US)
+
   * GP on call

Latest revision as of 10:59, 9 November 2018

Standing agenda

1. Problems encountered this week

2. Upgrades/improvements made this week

3. What are we planning to do next week?

4. Long-term project updates (if not already covered)

5. Special topics

6. Actions

7. Review Fabric tasks

  1.   Link

8. AoTechnicalB

9. Availability for next week

10. On-Call

11. AoOtherB

Operation problems

Massive recall in Facilities caused congestion; resolved by waiting

Operation news

 * na62 has moved to WLCGTape
 * Repack upgraded to SL7/2.1.17-35

Plans for next few weeks

  * Decommission disk servers from ATLAS d1t0.
  * Move all needed disk servers from ATLAS d0t1 to wlcgTape (gdss893, gdss894, gdss895)
  * Move the rest of the Gen VOs to WLCGTape
  * Proceed with the cmsDisk decommissioning
  * Decommission xrootd-cms-manager

Long-term projects

  * New CASTOR WLCGTape instance. Still to do: create a separate xrootd redirector for ALICE

Actions

Staffing

  * RA out until a week Monday.
  * GP on call