Difference between revisions of "Tier1 Operations Report 2019-11-06"
From GridPP Wiki
Latest revision as of 12:44, 6 November 2019
RAL Tier1 Operations Report for 6th November 2019
Review of Issues during the week 30th October 2019 to the 5th November 2019. |
- FTS failures for CMS transfers to srm-cms.gridpp.rl.ac.uk, initially attributed to "//", are actually an issue with Rucio testing. KE is investigating.
Current operational status and issues |
Notable Changes made since the last meeting. |
- NTR
Entries in GOC DB starting since the last report. |
Service | ID | Scheduled? | Outage/At Risk | Start | End | Duration | Reason |
---|---|---|---|---|---|---|---|
Declared in the GOC DB |
Service | ID | Scheduled? | Outage/At Risk | Start | End | Duration | Reason |
---|---|---|---|---|---|---|---|
- | - | - | - | - | - | - | - |
- No ongoing downtime
Advanced warning for other interventions |
The following items are being discussed and are still to be formally scheduled and announced. |
Listing by category:
- DNS servers will be rolled out within the Tier1 network.
Open GGUS Tickets |
Ticket-ID | Type | VO | Site | Priority | Responsible Unit | Status | Last Update | Subject | Scope |
---|---|---|---|---|---|---|---|---|---|
143916 | USER | cms | RAL-LCG2 | urgent | NGI_UK | in progress | 2019-11-05 07:39:00 | Transfers failing to T1_UK_RAL_Disk | WLCG |
143767 | USER | cms | RAL-LCG2 | urgent | NGI_UK | in progress | 2019-11-01 14:48:00 | FIle read issues for Workflows where data is located at T1_UK_RAL | WLCG |
143762 | TEAM | lhcb | RAL-LCG2 | urgent | NGI_UK | in progress | 2019-10-23 14:12:00 | Stop using sl6 queues at RAL | WLCG |
143669 | USER | snoplus.snolab.ca | RAL-LCG2 | urgent | NGI_UK | in progress | 2019-10-18 14:25:00 | SNO+ LFC to DFC migration | EGI |
143645 | TEAM | lhcb | RAL-LCG2 | top priority | NGI_UK | on hold | 2019-10-30 14:42:00 | Jobs Failed to access files at RAL-LCG2 | WLCG |
143323 | TEAM | lhcb | RAL-LCG2 | top priority | NGI_UK | on hold | 2019-10-30 14:43:00 | File deletion at RAL ECHO | WLCG |
142350 | TEAM | lhcb | RAL-LCG2 | top priority | NGI_UK | on hold | 2019-10-30 14:44:00 | Proble accessing some LHCb files at RAL | WLCG |
GGUS Tickets Closed Last week |
Ticket-ID | Type | VO | Site | Priority | Responsible Unit | Status | Last Update | Subject | Scope |
---|---|---|---|---|---|---|---|---|---|
143917 | USER | cms | RAL-LCG2 | urgent | NGI_UK | solved | 2019-11-04 17:08:00 | Transfers failing to T1_UK_RAL_Disk | WLCG |
143876 | USER | cms | RAL-LCG2 | urgent | NGI_UK | solved | 2019-11-01 14:46:00 | T1_UK_RAL HammerCloud cannot reach files via xrootd | WLCG |
143874 | USER | ops | RAL-LCG2 | less urgent | NGI_UK | verified | 2019-11-01 12:31:00 | [Rod Dashboard] Issue detected : org.nagios.BDII-Check@lcgbdii.gridpp.rl.ac.uk | EGI |
143869 | TEAM | lhcb | RAL-LCG2 | very urgent | NGI_UK | verified | 2019-11-06 11:31:00 | (again) file transfers low efficiency | WLCG |
143838 | TEAM | atlas | RAL-LCG2 | less urgent | NGI_UK | solved | 2019-11-01 11:17:00 | RAL-LCG2: TRANSFER an end-of-file was reached globus_xio: An end of file occurred | WLCG |
143834 | USER | cms | RAL-LCG2 | urgent | NGI_UK | solved | 2019-10-30 11:48:00 | transfers failing to T1_UK_RAL_Disk | WLCG |
143831 | TEAM | lhcb | RAL-LCG2 | very urgent | NGI_UK | verified | 2019-10-30 12:28:00 | low efficiency at gsiftp://gridftp.echo.stfc.ac.uk | WLCG |
Availability Report |
Day | Atlas | CMS | LHCB | Alice | Comments |
---|---|---|---|---|---|
2019-10-30 | 100 | 100 | 100 | 100 | |
2019-10-31 | 100 | 100 | 100 | 100 | |
2019-11-01 | 100 | 100 | 100 | 100 | |
2019-11-02 | 100 | 100 | 100 | 100 | |
2019-11-03 | 100 | 100 | 100 | 100 | |
2019-11-04 | 100 | 97 | 100 | 100 | |
2019-11-05 | 100 | 97 | 100 | 100 | |
Hammercloud Test Report |
Target Availability for each site is 97.0% |
Day | Atlas HC | CMS HC | Comment |
---|---|---|---|
2019-10-30 | 100 | 64 | |
2019-10-31 | 96 | 97 | |
2019-11-01 | 100 | 96 | |
2019-11-02 | 100 | 97 | |
2019-11-03 | 96 | 98 | |
2019-11-04 | 100 | 99 | |
2019-11-05 | 100 | 99 | |
Key: Atlas HC = Atlas HammerCloud (Queue RAL-LCG2_UCORE, Template 841); CMS HC = CMS HammerCloud
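As a rough illustration of reading the table against the stated 97.0% target, the sketch below lists the days on which either HammerCloud figure fell under it. The names `hc_results` and `below_target` are hypothetical, the dates assume the reporting week of 30 October to 5 November given in the report heading, and comparing each daily figure directly with the target is an assumption about how the target is meant to be applied.

```python
# Minimal sketch: flag daily HammerCloud results under the 97.0% target.
# Values are copied from the Hammercloud Test Report table; dates assume
# the reporting week 30 October to 5 November 2019.
TARGET = 97.0

# (day, Atlas HC %, CMS HC %)
hc_results = [
    ("2019-10-30", 100, 64),
    ("2019-10-31", 96, 97),
    ("2019-11-01", 100, 96),
    ("2019-11-02", 100, 97),
    ("2019-11-03", 96, 98),
    ("2019-11-04", 100, 99),
    ("2019-11-05", 100, 99),
]


def below_target(rows, target=TARGET):
    """Return (day, test name, value) for every result strictly under target."""
    misses = []
    for day, atlas, cms in rows:
        for name, value in (("Atlas HC", atlas), ("CMS HC", cms)):
            if value < target:
                misses.append((day, name, value))
    return misses


if __name__ == "__main__":
    for day, name, value in below_target(hc_results):
        print(f"{day} {name}: {value}% (target {TARGET}%)")
```

A result of exactly 97 is treated here as meeting the target; change the comparison to `<=` if the target is meant to be exceeded.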
Notes from Meeting. |