Difference between revisions of "Tier1 Operations Report 2019-10-09"

From GridPP Wiki
(Created page with "==RAL Tier1 Operations Report for 9th October 2019== __NOTOC__ ====== ====== <!-- ************************************************************* -----> <!-- ***********Start ...")
 
 
(7 intermediate revisions by one user not shown)
Line 10: Line 10:
 
| style="background-color: #b7f1ce; border-bottom: 1px solid silver; text-align: center; font-size: 1em; font-weight: bold; margin-top: 0; margin-bottom: 0; padding-top: 0.1em; padding-bottom: 0.1em;" | Review of Issues during the week 2nd October 2019 to the 8th October 2019.
 
| style="background-color: #b7f1ce; border-bottom: 1px solid silver; text-align: center; font-size: 1em; font-weight: bold; margin-top: 0; margin-bottom: 0; padding-top: 0.1em; padding-bottom: 0.1em;" | Review of Issues during the week 2nd October 2019 to the 8th October 2019.
 
|}
 
|}
* IPv6 packet loss on SuperJanet solved with network intervention.
+
* New IPv6 packet loss on SuperJanet from non-OPN subnetted hosts. 8/10/2019
 
<!-- ***********End Review of Issues during last week*********** ----->
 
<!-- ***********End Review of Issues during last week*********** ----->
 
<!-- *********************************************************** ----->
 
<!-- *********************************************************** ----->
Line 125: Line 125:
 
| style="background-color: #b7f1ce; border-bottom: 1px solid silver; text-align: center; font-size: 1em; font-weight: bold; margin-top: 0; margin-bottom: 0; padding-top: 0.1em; padding-bottom: 0.1em;" | Open GGUS Tickets  
 
| style="background-color: #b7f1ce; border-bottom: 1px solid silver; text-align: center; font-size: 1em; font-weight: bold; margin-top: 0; margin-bottom: 0; padding-top: 0.1em; padding-bottom: 0.1em;" | Open GGUS Tickets  
 
|}
 
|}
 +
  
 
{| border=1 align=center
 
{| border=1 align=center
Line 138: Line 139:
 
! Subject
 
! Subject
 
! Scope
 
! Scope
|-
 
| 143402
 
| USER
 
| none
 
| RAL-LCG2
 
| urgent
 
| NGI_UK
 
| in progress
 
| 2019-09-30 12:23:00
 
| CVMFS IPv6 connection issues at RAL
 
| EGI
 
|-
 
| 143387
 
| USER
 
| snoplus.snolab.ca
 
| RAL-LCG2
 
| less urgent
 
| NGI_UK
 
| in progress
 
| 2019-10-01 10:29:00
 
| Transfer issues to RAL
 
| EGI
 
 
|-
 
|-
 
| 143323
 
| 143323
Line 168: Line 147:
 
| NGI_UK
 
| NGI_UK
 
| in progress
 
| in progress
| 2019-09-27 15:20:00
+
| 2019-10-08 08:05:00
 
| File deletion at RAL ECHO
 
| File deletion at RAL ECHO
 
| WLCG
 
| WLCG
Line 179: Line 158:
 
| NGI_UK
 
| NGI_UK
 
| in progress
 
| in progress
| 2019-09-18 14:09:00
+
| 2019-10-07 14:52:00
 
| Problem accessing some LHCb files at RAL

| Problem accessing some LHCb files at RAL
 
| WLCG
 
| WLCG
 
|}
 
|}
 
 
  
  
Line 199: Line 176:
 
| style="background-color: #b7f1ce; border-bottom: 1px solid silver; text-align: center; font-size: 1em; font-weight: bold; margin-top: 0; margin-bottom: 0; padding-top: 0.1em; padding-bottom: 0.1em;" | GGUS Tickets Closed Last week
 
| style="background-color: #b7f1ce; border-bottom: 1px solid silver; text-align: center; font-size: 1em; font-weight: bold; margin-top: 0; margin-bottom: 0; padding-top: 0.1em; padding-bottom: 0.1em;" | GGUS Tickets Closed Last week
 
|}
 
|}
 
  
 
{| border=1 align=center
 
{| border=1 align=center
Line 214: Line 190:
 
! Scope
 
! Scope
 
|-
 
|-
| 143406
+
| 143569
| USER
+
| TEAM
| cms
+
| atlas
 
| RAL-LCG2
 
| RAL-LCG2
| urgent
+
| top priority
 
| NGI_UK
 
| NGI_UK
 
| solved
 
| solved
| 2019-10-01 06:27:00
+
| 2019-10-09 11:57:00
| transfers failing to T1_UK_RAL_Disk
+
| Problem with FTS at RAL
 
| WLCG
 
| WLCG
 
|-
 
|-
| 143384
+
| 143567
 
| TEAM
 
| TEAM
| atlas
+
| lhcb
 
| RAL-LCG2
 
| RAL-LCG2
 
| very urgent
 
| very urgent
 
| NGI_UK
 
| NGI_UK
 
| solved
 
| solved
| 2019-09-25 22:08:00
+
| 2019-10-09 11:54:00
| Low efficiency of Atlas transfers to sites in UK cloud
+
| FTS3 problem for transfers executing at RAL FTS3 server
 
| WLCG
 
| WLCG
 
|-
 
|-
| 143379
+
| 143565
 
| USER
 
| USER
 
| cms
 
| cms
Line 243: Line 219:
 
| NGI_UK
 
| NGI_UK
 
| solved
 
| solved
| 2019-09-26 06:40:00
+
| 2019-10-09 11:53:00
| issues with RAL FTS?
+
| RAL FTS is Down
 
| WLCG
 
| WLCG
 
|-
 
|-
| 143225
+
| 143402
 
| USER
 
| USER
| cms
+
| none
 
| RAL-LCG2
 
| RAL-LCG2
| very urgent
+
| urgent
 
| NGI_UK
 
| NGI_UK
| verified
+
| solved
| 2019-09-25 06:04:00
+
| 2019-10-09 11:52:00
| some of RAL FTS servers are not running?
+
| CVMFS IPv6 connection issues at RAL
| WLCG
+
| EGI
 
|-
 
|-
| 143198
+
| 143387
 
| USER
 
| USER
| cms
+
| snoplus.snolab.ca
 
| RAL-LCG2
 
| RAL-LCG2
| urgent
+
| less urgent
 +
| NGI_UK
 +
| solved
 +
| 2019-10-04 12:35:00
 +
| Transfer issues to RAL
 +
| EGI
 +
|-
 +
| 143324
 +
| TEAM
 +
| lhcb
 +
| RAL-LCG2
 +
| very urgent
 
| NGI_UK
 
| NGI_UK
 
| closed
 
| closed
| 2019-09-27 23:59:00
+
| 2019-10-04 23:59:00
| issues with RAL FTS?
+
| File recreation canceled since the file cannot be routed to tape
 
| WLCG
 
| WLCG
 
|-
 
|-
| 142689
+
| 143231
 
| USER
 
| USER
| cms
+
| other
 
| RAL-LCG2
 
| RAL-LCG2
| very urgent
+
| urgent
 +
| EGI CVMFS Service
 +
| closed
 +
| 2019-10-04 23:59:00
 +
| CVMFS repo dirac.egi.eu updates are not propagated
 +
| EGI
 +
|-
 +
| 143218
 +
| TEAM
 +
| lhcb
 +
| RAL-LCG2
 +
| urgent
 
| NGI_UK
 
| NGI_UK
| solved
+
| closed
| 2019-10-01 15:33:00
+
| 2019-10-08 23:59:00
| Transfer failing to RAL_Disk
+
| FTS3 transfers problem to GRIDKA for transfers executing at RAL FTS3 server
 
| WLCG
 
| WLCG
 
|-
 
|-
| 140447
+
| 142835
 
| USER
 
| USER
| dteam
+
| snoplus.snolab.ca
 
| RAL-LCG2
 
| RAL-LCG2
 
| less urgent
 
| less urgent
 
| NGI_UK
 
| NGI_UK
| solved
+
| closed
| 2019-09-27 08:34:00
+
| 2019-10-02 23:59:00
| packet loss outbound from RAL-LCG2 over IPv6
+
| Connection Issues
 
| EGI
 
| EGI
 
|}
 
|}
Line 305: Line 303:
 
Availability Report
 
Availability Report
 
|}
 
|}
 +
 
{| border=1 align=center
 
{| border=1 align=center
 
|- bgcolor="#7c8aaf"
 
|- bgcolor="#7c8aaf"
Line 314: Line 313:
 
! Comments
 
! Comments
 
|-
 
|-
| 2019-09-25
+
| 2019-10-02
 +
| 100
 +
| 100
 
| 100
 
| 100
| 87
+
| 96
| 92
+
| 81
+
 
|  
 
|  
 
|-
 
|-
| 2019-09-26
+
| 2019-10-03
 
| 100
 
| 100
 
| 100
 
| 100
Line 328: Line 327:
 
|  
 
|  
 
|-
 
|-
| 2019-09-27
+
| 2019-10-04
 
| 100
 
| 100
 
| 100
 
| 100
Line 335: Line 334:
 
|  
 
|  
 
|-
 
|-
| 2019-09-28
+
| 2019-10-05
 
| 100
 
| 100
 
| 100
 
| 100
Line 342: Line 341:
 
|  
 
|  
 
|-
 
|-
| 2019-09-29
+
| 2019-10-06
 
| 100
 
| 100
 
| 100
 
| 100
Line 349: Line 348:
 
|  
 
|  
 
|-
 
|-
| 2019-09-30
+
| 2019-10-07
 
| 100
 
| 100
 
| 100
 
| 100
Line 356: Line 355:
 
|  
 
|  
 
|-
 
|-
| 2019-10-01
+
| 2019-10-08
 
| 100
 
| 100
 
| 100
 
| 100
Line 380: Line 379:
 
! Day !! Atlas HC !! CMS HC !! Comment
 
! Day !! Atlas HC !! CMS HC !! Comment
 
|-
 
|-
| 2019-09-25 || 89 || 100 ||  
+
| 2019-10-02 || 100 || 100 ||  
 
|-
 
|-
| 2019-09-26 || 100 || 100 ||  
+
| 2019-10-03 || 100 || 100 ||  
 
|-
 
|-
| 2019-09-27 || 89 || 100 ||  
+
| 2019-10-04 || 100 || 99 ||  
 
|-
 
|-
| 2019-09-28 || 100 || 100 ||  
+
| 2019-10-05 || 100 || 99 ||  
 
|-
 
|-
| 2019-09-29 || 100 || 100||  
+
| 2019-10-06 || 100 || 100||  
 
|-
 
|-
| 2019-09-30|| 92|| 100||  
+
| 2019-10-08|| 100|| 100||  
 
|-
 
|-
| 2019-10-01 || 100 || 100 ||  
+
| 2019-10-09 || 100 || 100 ||  
 
|-
 
|-
 
|}  
 
|}  

Latest revision as of 11:58, 9 October 2019

RAL Tier1 Operations Report for 9th October 2019

Review of Issues during the week 2nd October 2019 to the 8th October 2019.
  • New IPv6 packet loss on SuperJanet from non-OPN subnetted hosts. 8/10/2019
Current operational status and issues
Notable Changes made since the last meeting.
  • NTR
Entries in GOC DB starting since the last report.
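{| border=1 align=center
! Service !! ID !! Scheduled? !! Outage/At Risk !! Start !! End !! Duration !! Reason
|}
Declared in the GOC DB
{| border=1 align=center
! Service !! ID !! Scheduled? !! Outage/At Risk !! Start !! End !! Duration !! Reason
|-
| - || - || - || - || - || - || - || -
|}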
  • No ongoing downtime
Advanced warning for other interventions
The following items are being discussed and are still to be formally scheduled and announced.


Listing by category:

  • DNS servers will be rolled out within the Tier1 network.
Open GGUS Tickets


{| border=1 align=center
! Ticket-ID !! Type !! VO !! Site !! Priority !! Responsible Unit !! Status !! Last Update !! Subject !! Scope
|-
| 143323 || TEAM || lhcb || RAL-LCG2 || top priority || NGI_UK || in progress || 2019-10-08 08:05:00 || File deletion at RAL ECHO || WLCG
|-
| 142350 || TEAM || lhcb || RAL-LCG2 || top priority || NGI_UK || in progress || 2019-10-07 14:52:00 || Problem accessing some LHCb files at RAL || WLCG
|}


GGUS Tickets Closed Last week
{| border=1 align=center
! Ticket-ID !! Type !! VO !! Site !! Priority !! Responsible Unit !! Status !! Last Update !! Subject !! Scope
|-
| 143569 || TEAM || atlas || RAL-LCG2 || top priority || NGI_UK || solved || 2019-10-09 11:57:00 || Problem with FTS at RAL || WLCG
|-
| 143567 || TEAM || lhcb || RAL-LCG2 || very urgent || NGI_UK || solved || 2019-10-09 11:54:00 || FTS3 problem for transfers executing at RAL FTS3 server || WLCG
|-
| 143565 || USER || cms || RAL-LCG2 || urgent || NGI_UK || solved || 2019-10-09 11:53:00 || RAL FTS is Down || WLCG
|-
| 143402 || USER || none || RAL-LCG2 || urgent || NGI_UK || solved || 2019-10-09 11:52:00 || CVMFS IPv6 connection issues at RAL || EGI
|-
| 143387 || USER || snoplus.snolab.ca || RAL-LCG2 || less urgent || NGI_UK || solved || 2019-10-04 12:35:00 || Transfer issues to RAL || EGI
|-
| 143324 || TEAM || lhcb || RAL-LCG2 || very urgent || NGI_UK || closed || 2019-10-04 23:59:00 || File recreation canceled since the file cannot be routed to tape || WLCG
|-
| 143231 || USER || other || RAL-LCG2 || urgent || EGI CVMFS Service || closed || 2019-10-04 23:59:00 || CVMFS repo dirac.egi.eu updates are not propagated || EGI
|-
| 143218 || TEAM || lhcb || RAL-LCG2 || urgent || NGI_UK || closed || 2019-10-08 23:59:00 || FTS3 transfers problem to GRIDKA for transfers executing at RAL FTS3 server || WLCG
|-
| 142835 || USER || snoplus.snolab.ca || RAL-LCG2 || less urgent || NGI_UK || closed || 2019-10-02 23:59:00 || Connection Issues || EGI
|}


Availability Report

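{| border=1 align=center
! Day !! Atlas !! CMS !! LHCB !! Alice !! Comments
|-
| 2019-10-02 || 100 || 100 || 100 || 96 ||
|-
| 2019-10-03 || 100 || 100 || 100 || 100 ||
|-
| 2019-10-04 || 100 || 100 || 100 || 100 ||
|-
| 2019-10-05 || 100 || 100 || 100 || 100 ||
|-
| 2019-10-06 || 100 || 100 || 100 || 100 ||
|-
| 2019-10-07 || 100 || 100 || 100 || 100 ||
|-
| 2019-10-08 || 100 || 100 || 100 || 100 ||
|}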
Hammercloud Test Report
Target Availability for each site is 97.0%
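{| border=1 align=center
! Day !! Atlas HC !! CMS HC !! Comment
|-
| 2019-10-02 || 100 || 100 ||
|-
| 2019-10-03 || 100 || 100 ||
|-
| 2019-10-04 || 100 || 99 ||
|-
| 2019-10-05 || 100 || 99 ||
|-
| 2019-10-06 || 100 || 100 ||
|-
| 2019-10-08 || 100 || 100 ||
|-
| 2019-10-09 || 100 || 100 ||
|}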

Key: Atlas HC = Atlas HammerCloud (Queue RAL-LCG2_UCORE, Template 841); CMS HC = CMS HammerCloud
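
The comparison against the 97.0% target can be sketched in a few lines of Python. This is only an illustrative check, not part of the report tooling: the names `TARGET`, `HC_ROWS` and `below_target` are hypothetical, and the values are copied by hand from the Hammercloud table in this report, with the days exactly as listed there.

```python
TARGET = 97.0  # percent, the target availability stated in the report

# (day, Atlas HC, CMS HC), copied from the Hammercloud Test Report table
HC_ROWS = [
    ("2019-10-02", 100, 100),
    ("2019-10-03", 100, 100),
    ("2019-10-04", 100, 99),
    ("2019-10-05", 100, 99),
    ("2019-10-06", 100, 100),
    ("2019-10-08", 100, 100),
    ("2019-10-09", 100, 100),
]


def below_target(rows, target=TARGET):
    """Return (day, column_index, value) for every value under the target."""
    return [
        (day, index, value)
        for day, *values in rows
        for index, value in enumerate(values)
        if value < target
    ]


# No Hammercloud value in this week's table is under 97.0, so this prints [].
print(below_target(HC_ROWS))
```

The column index in each result refers to the position after the day (0 for Atlas HC, 1 for CMS HC), which keeps the helper usable on tables with a different number of columns.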

Notes from Meeting.