Difference between revisions of "RAL Tier1 weekly operations castor 16/11/2018"
From GridPP Wiki
Latest revision as of 14:54, 16 November 2018
Standing agenda
1. Problems encountered this week
2. Upgrades/improvements made this week
3. What are we planning to do next week?
4. Long-term project updates (if not already covered)
5. Special topics
6. Actions
7. Review Fabric tasks
1. Link
8. AoTechnicalB
9. Availability for next week
10. On-Call
11. AoOtherB
Operation problems
* gdss736 (lhcbDst) crashed and was removed from prod; it is now back in production
* The /etc/cron.d/check_tape_pools.ncm-cron.cron file was missing from the WLCGTape headnodes. As a result, the tape pools were not topped up with free tapes and a large backlog of ATLAS canbemigrs was created (RT218153, https://helpdesk.gridpp.rl.ac.uk/Ticket/Display.html?id=218153). This was fixed on Aquilon on Mon 12/11 and the backlog is now clearing
Operation news
* Decommissioned all disk servers from ATLAS atlasStripInput and atlasTape
* Moved all needed disk servers from atlasTape to wlcgTape (gdss893, gdss894, gdss895)
* Allocated lcgcts27 and lcgcts28 to WLCGTape
* Migration of the Gen VOs (except ALICE) to WLCGTape
* fdsdss20 and fdsdss21 were removed from Facilities facD0T1 pool and decommissioned
Plans for next few weeks
* Proceed with the cmsDisk decommissioning
* Decommission xrootd-cms-manager
Long-term projects
* New CASTOR WLCGTape instance. Things that need doing: create a separate xrootd redirector for ALICE
* CASTOR disk server migration to Aquilon: gdss742 has been compiled with a draft Aquilon profile, but there are problems with the SL7 installation (RT216885, https://helpdesk.gridpp.rl.ac.uk/Ticket/Display.html?id=216885)
Actions
Staffing
* RA out from Thu 22/11