Difference between revisions of "Tier1 Operations Report 2020-01-29"
Latest revision as of 12:57, 29 January 2020
RAL Tier1 Operations Report for 29th January 2020
Review of Issues during the week 23rd January 2020 to the 28th January 2020.
- Farm drained because ceph@CERN was having issues.
- Castor upgraded successfully.
Current operational status and issues
Notable Changes made since the last meeting.
- NTR
Entries in GOC DB starting since the last report.
Service | ID | Scheduled? | Outage/At Risk | Start | End | Duration | Reason |
---|---|---|---|---|---|---|---|
Declared in the GOC DB
- No ongoing downtime
Advanced warning for other interventions
The following items are being discussed and are still to be formally scheduled and announced.
- CVMFS downtime for a physical server move. This will affect Stratum 0.
Open GGUS Tickets
Ticket-ID | Type | VO | Site | Priority | Responsible Unit | Status | Last Update | Subject | Scope |
---|---|---|---|---|---|---|---|---|---|
144989 | USER | cms | RAL-LCG2 | top priority | NGI_UK | assigned | 2020-01-29 07:41:00 | All transfers are failing using UK FTS3 | WLCG |
144953 | TEAM | atlas | RAL-LCG2 | urgent | NGI_UK | in progress | 2020-01-28 12:58:00 | RAL-LCG2: unable to submit | WLCG |
144884 | TEAM | atlas | RAL-LCG2 | urgent | NGI_UK | in progress | 2020-01-24 11:08:00 | The worker was failed while the job was starting : Job submission to LRMS failed | WLCG |
144549 | USER | mice | RAL-LCG2 | less urgent | NGI_UK | in progress | 2020-01-23 17:40:00 | Additional MICE Miscellaneous data for Castor | EGI |
144431 | USER | cms | RAL-LCG2 | urgent | NGI_UK | on hold | 2020-01-22 10:42:00 | Transfers failing to RAL_Disk | WLCG |
143669 | USER | snoplus.snolab.ca | RAL-LCG2 | urgent | NGI_UK | on hold | 2019-11-18 09:13:00 | SNO+ LFC to DFC migration | EGI |
143323 | TEAM | lhcb | RAL-LCG2 | top priority | NGI_UK | on hold | 2019-12-20 12:40:00 | File deletion at RAL ECHO | WLCG |
142350 | TEAM | lhcb | RAL-LCG2 | top priority | NGI_UK | on hold | 2020-01-22 12:55:00 | Problem accessing some LHCb files at RAL | WLCG |
GGUS Tickets Closed Last Week
Ticket-ID | Type | VO | Site | Priority | Responsible Unit | Status | Last Update | Subject | Scope |
---|---|---|---|---|---|---|---|---|---|
144913 | TEAM | atlas | RAL-LCG2 | less urgent | NGI_UK | solved | 2020-01-24 13:10:00 | RAL-LCG2: deletion errors | WLCG |
144683 | USER | cms | RAL-LCG2 | urgent | NGI_UK | closed | 2020-01-22 23:59:00 | Mistake in Siteconf Fallback SE name | WLCG |
Availability Report
Day | Atlas | CMS | LHCB | Alice | Comments |
---|---|---|---|---|---|
2020-01-22 | 100 | 63 | 100 | 69 | |
2020-01-23 | 100 | 100 | 100 | 100 | |
2020-01-24 | 100 | 100 | 100 | 100 | |
2020-01-25 | 100 | 100 | 100 | 100 | |
2020-01-26 | 100 | 100 | 100 | 100 | |
2020-01-27 | 100 | 100 | 100 | 100 | |
2020-01-28 | 100 | 100 | 98 | 100 | |
Hammercloud Test Report
Target Availability for each site is 97.0%
Day | Atlas HC | CMS HC | Comment |
---|---|---|---|
2020-01-22 | 100 | 99 | |
2020-01-23 | 90 | 99 | |
2020-01-24 | 100 | 97 | |
2020-01-25 | 100 | 99 | |
2020-01-26 | 92 | 98 | |
2020-01-27 | 100 | 98 | |
2020-01-28 | 91 | 98 | |
Key: Atlas HC = Atlas HammerCloud (Queue RAL-LCG2_UCORE, Template 841); CMS HC = CMS HammerCloud
Notes from Meeting.
Tier-1 Liaison 15/01/2020
Attendees: Brian, Katy, Darren, Henry, Rob, and Raja
- Possibility that CMS might not be using multi-core jobs efficiently/correctly.
- Henry (MICE) has written the last data to the MICE Archive. He will confirm this is all present and correct in the following week.
- Henry asked about using XRD, but was informed that CASTOR users should use SRM as the tape end-point.
- 144549: Henry to confirm write complete next week.
- 144457: Christophe to check and then close.
- 144431: Placeholder ticket for Katy.
- 143669: Action with Alistair.
- 143323/142350: Still on hold awaiting Echo Mimic.
- Tim RT’s tickets: no new ones, no progress on current ones.