Difference between revisions of "Tier1 Operations Report 2019-09-04"

From GridPP Wiki

Revision as of 10:06, 4 September 2019

RAL Tier1 Operations Report for 04th September 2019

Review of Issues during the week 25th July 2019 to the 31st July 2019.
  • ATLAS and LHCb keeping robot busy with recall campaigns.
  • Power station demolition caused cloud resources to be offline, grid hardware ok.
  • lsst jobs running after fixing an unexpected change of VOMS certificate.


Current operational status and issues
Notable Changes made since the last meeting.
  • NTR
Entries in GOC DB starting since the last report.
Service | ID | Scheduled? | Outage/At Risk | Start | End | Duration | Reason
Declared in the GOC DB
Service | ID | Scheduled? | Outage/At Risk | Start | End | Duration | Reason
- | - | - | - | - | - | - | -
  • No ongoing downtime
Advanced warning for other interventions
The following items are being discussed and are still to be formally scheduled and announced.


Listing by category:

  • DNS servers will be rolled out within the Tier1 network.
Open GGUS Tickets


Ticket-ID | Type | VO | Site | Priority | Responsible Unit | Status | Last Update | Subject | Scope
142981 | USER | mice | RAL-LCG2 | less urgent | NGI_UK | in progress | 2019-09-03 13:00:00 | mice; LFC to DFC transition | EGI
142955 | USER | ops | RAL-LCG2 | less urgent | NGI_UK | in progress | 2019-09-02 10:26:00 | [Rod Dashboard] Issues detected at RAL-LCG2 | EGI
142835 | USER | snoplus.snolab.ca | RAL-LCG2 | less urgent | NGI_UK | waiting for reply | 2019-08-30 09:25:00 | Connection Issues | EGI
142689 | USER | cms | RAL-LCG2 | very urgent | NGI_UK | in progress | 2019-09-02 17:22:00 | Transfer failing to RAL_Disk | WLCG
142350 | TEAM | lhcb | RAL-LCG2 | top priority | NGI_UK | in progress | 2019-09-03 12:41:00 | Proble accessing some LHCb files at RAL | WLCG
140447 | USER | dteam | RAL-LCG2 | less urgent | NGI_UK | on hold | 2019-08-22 10:04:00 | packet loss outbound from RAL-LCG2 over IPv6 | EGI




GGUS Tickets Closed Last week
Ticket-ID | Type | VO | Site | Priority | Responsible Unit | Status | Last Update | Subject | Scope
142751 | USER | snoplus.snolab.ca | RAL-LCG2 | top priority | NGI_UK | solved | 2019-08-21 08:39:00 | Data transfer failure and proxy issue | EGI
142694 | TEAM | atlas | RAL-LCG2 | urgent | NGI_UK | solved | 2019-08-14 09:10:00 | RAL-LCG2 transfer errors at source | WLCG
142665 | USER | cms | RAL-LCG2 | urgent | NGI_UK | solved | 2019-08-14 09:32:00 | Failing to transfer few files to RAL_Disk from CERN | WLCG
142520 | USER | cms | RAL-LCG2 | urgent | NGI_UK | closed | 2019-08-14 23:59:00 | T1_UK_RAL is failing SAM tests | WLCG
142337 | TEAM | lhcb | RAL-LCG2 | very urgent | NGI_UK | verified | 2019-08-14 15:10:00 | Pilots Failed at RAL-LCG2 | WLCG
142203 | TEAM | atlas | RAL-LCG2 | urgent | NGI_UK | closed | 2019-08-14 23:59:00 | RAL-LCG2_MCORE jobs failing | WLCG
140220 | USER | mice | RAL-LCG2 | less urgent | NGI_UK | solved | 2019-08-14 19:09:00 | mice LFC to DFC transition | EGI


Availability Report

Day | Atlas | CMS | LHCB | Alice | Comments
2019-08-14 | 100 | 99 | 100 | 100 |
2019-08-15 | 100 | 100 | 67 | 68 |
2019-08-16 | 100 | 100 | 48 | 48 |
2019-08-17 | 100 | 100 | 100 | 100 |
2019-08-18 | 100 | 100 | 100 | 100 |
2019-08-19 | 100 | 100 | 100 | 100 |
2019-08-20 | 100 | 100 | 100 | 100 |
Hammercloud Test Report
Target Availability for each site is 97.0%
Day | Atlas HC | CMS HC | Comment
2019-08-14 | 100 | 99 |
2019-08-15 | 100 | 99 |
2019-08-16 | 100 | 98 |
2019-08-17 | 100 | 100 |
2019-08-18 | 100 | 98 |
2019-08-19 | 100 | 100 |
2019-08-20 | 0 | 100 |

Key: Atlas HC = Atlas HammerCloud (Queue RAL-LCG2_UCORE, Template 841); CMS HC = CMS HammerCloud
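
The daily figures above can be checked against the 97.0% target with a few lines of Python. This is a minimal sketch, not part of the report's tooling: the values are transcribed by hand from the two tables, the names `below_target` and `weekly_mean` are hypothetical, and applying the 97.0% target to the Availability Report as well as the Hammercloud figures is an assumption.

```python
TARGET = 97.0  # target availability (%) quoted in the report

# Daily figures transcribed from the Availability Report table.
availability = {
    "2019-08-14": {"Atlas": 100, "CMS": 99, "LHCB": 100, "Alice": 100},
    "2019-08-15": {"Atlas": 100, "CMS": 100, "LHCB": 67, "Alice": 68},
    "2019-08-16": {"Atlas": 100, "CMS": 100, "LHCB": 48, "Alice": 48},
    "2019-08-17": {"Atlas": 100, "CMS": 100, "LHCB": 100, "Alice": 100},
    "2019-08-18": {"Atlas": 100, "CMS": 100, "LHCB": 100, "Alice": 100},
    "2019-08-19": {"Atlas": 100, "CMS": 100, "LHCB": 100, "Alice": 100},
    "2019-08-20": {"Atlas": 100, "CMS": 100, "LHCB": 100, "Alice": 100},
}

# Daily figures transcribed from the Hammercloud Test Report table.
hammercloud = {
    "2019-08-14": {"Atlas HC": 100, "CMS HC": 99},
    "2019-08-15": {"Atlas HC": 100, "CMS HC": 99},
    "2019-08-16": {"Atlas HC": 100, "CMS HC": 98},
    "2019-08-17": {"Atlas HC": 100, "CMS HC": 99},
    "2019-08-18": {"Atlas HC": 100, "CMS HC": 98},
    "2019-08-19": {"Atlas HC": 100, "CMS HC": 100},
    "2019-08-20": {"Atlas HC": 0, "CMS HC": 100},
}


def below_target(table, target=TARGET):
    """Return (day, column, value) for every entry under the target."""
    return [
        (day, name, value)
        for day, row in sorted(table.items())
        for name, value in row.items()
        if value < target
    ]


def weekly_mean(table, column):
    """Return the plain average of one column over all listed days."""
    values = [row[column] for row in table.values()]
    return sum(values) / len(values)


if __name__ == "__main__":
    for label, table in (("Availability", availability),
                         ("Hammercloud", hammercloud)):
        for day, name, value in below_target(table):
            print(f"{label}: {day} {name} = {value}% (target {TARGET}%)")
```

Run as a script, this lists the LHCB and Alice dips on 2019-08-15 and 2019-08-16 and the Atlas HC value of 0 on 2019-08-20. The report gives no comment on that zero, so whether it reflects failed tests or missing data is left open here.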

Notes from Meeting.