Difference between revisions of "RAL Tier1 weekly operations castor 23/11/2018"

Latest revision as of 10:54, 23 November 2018

Standing agenda

1. Problems encountered this week

2. Upgrades/improvements made this week

3. What are we planning to do next week?

4. Long-term project updates (if not already covered)

5. Special topics

6. Actions

7. Review Fabric tasks

  1.   Link

8. AoTechnicalB

9. Availability for next week

10. On-Call

11. AoOtherB

Operation problems

Operation news

 * Neptune and Pluto DB patching completed on Tue

 * Continue with deleting CMS files on cmsDisk

 * Recovery of more na62 files

Plans for next few weeks

  * Proceed with the cmsDisk decommissioning

  * Decommission xrootd-cms-manager

  * Decommission ATLAS headnodes

  * Complete kernel patching on CASTOR hosts

  * Oracle/kernel patching for CASTOR Facilities DB

  * Deploy new disk servers for Facilities

Long-term projects

  * New CASTOR WLCGTape instance. Outstanding task: create a separate xrootd redirector for ALICE

  * CASTOR disk server migration to Aquilon: gdss742 has been compiled with a draft Aquilon profile,
    but there are problems with the SL7 installation (RT216885)

Actions

Staffing

  * RA out until 10/12

@@ Line 25: / Line 25: @@
 == Operation problems ==
-   * gdss736 (lhcbDst) crashed and was removed from prod; it is now back in production
-   * The /etc/cron.d/check_tape_pools.ncm-cron.cron file was missing from the WLCGTape headnodes; as a result the tape pools were not
-     topped up with free tapes and a large backlog of ATLAS canbemigrs was created [https://helpdesk.gridpp.rl.ac.uk/Ticket/Display.html?id=218153 RT218153].
-     This was fixed on Aquilon on Mon 12/11 and the backlog is now clearing
 == Operation news ==
-  * Decommissioned all disk servers from ATLAS atlasStripInput and atlasTape
-   * Moved all needed disk servers from atlasTape to wlcgTape (gdss893, gdss894, gdss895)
+   * Neptune and Pluto DB patching completed on Tue
-   * Allocated lcgcts27 and lcgcts28 to WLCGTape
+   * Continue with deleting CMS files on cmsDisk
-   * Migration of the Gen VOs (except Alice) to WLCGTape
+   * Recovery of more na62 files
-  * fdsdss20 and fdsdss21 were removed from Facilities facD0T1 pool and decommissioned
 == Plans for next few weeks ==
@@ Line 49: / Line 39: @@
     * Decommission xrootd-cms-manager
+   * Decommission ATLAS headnodes
+   * Complete kernel patching on CASTOR hosts
+   * Oracle/kernel patching for CASTOR Facilities DB
+   * Deploy new disk servers for Facilities
 == Long-term projects ==
@@ Line 62: / Line 60: @@
 == Staffing ==
-    * RA out from Thu 22/11
+    * RA out until 10/12
