Past Ticket Bulletins
Monday 4th February 2019, 14.30 GMT
41 Open UK Tickets this month.
NGI
139506 (4/2)
The NGI got a ticket regarding Birmingham's availability figures, which have been thrown off by the decommissioning of their SE. We need to formulate a response, but we should perhaps ask for an A/R recomputation for January for the site. Assigned (4/2)
OXFORD
139431 (30/1)
A request from CMS to update the site's site-local-config. Being looked at. In progress (31/1)
138647 (3/12/18)
Ticket tracking the t2k DFC migration at Oxford. Kashif has supplied the best file dump that he can without DOME installed. Daniela has asked the VO if they can enact a "clean slate" solution at Oxford to make life easier for all. In progress (31/1)
131615 (3/11/17)
Oxford's IPv6 ticket. Kashif has kept this up to date, with some semi-positive news - things are moving in the right direction, however slowly. On Hold (7/1)
BRISTOL
139410 (30/1)
CMS ticket for transfer failures from Florida to the site. Investigation suggests that this might be an IPv6 issue. In progress (4/2)
131613 (3/11/17)
Bristol's IPv6 ticket. Good progress here, but more holes needed to be poked in the site's v6 firewall. We'll need to check the PS mesh (still all grey for Bristol's v6 endpoints at time of writing). In progress (4/2)
BIRMINGHAM
137801 (17/10/18)
Ticket tracking the decommissioning of the Birmingham DPM. The node was removed from gocdb and switched off last week. I can't remember how long these tickets need to be kept open - I should look that up really. Just remember to keep your logs for 90 days Mark! In progress (30/1)
138894 (17/12/18)
This ROD ticket for the decommissioned SE might have hit a problem - Mark removed the server from the gocdb but there's still an alarm on the dashboard... On Hold (9/1)
138244 (12/11/18)
Meanwhile since killing off the old DPM completely the Birmingham Availability/Reliability figures have started to fix themselves. On Hold (1/2)
131612 (3/11/17)
Birmingham's v6 ticket. Some good news just before Christmas, hopefully Mark will be able to start dual-stacking once he's cleared his plate a bit. On Hold (24/12/18)
GLASGOW
131611 (3/11/17)
Only the v6 ticket at Glasgow. Last update (today) was a request for info from the v6 ticket watchers. In progress (4/12/18)
EDINBURGH
139240 (21/1)
An LHCB ticket about jobs failing, tracked to a "black hole" node that was taken offline. Last update was waiting on the VO to confirm if the problem has gone away, which they were having problems doing due to having "issues" at the time. If there's no word from LHCB soon then I would close this ticket. In progress (22/1)
138243 (12/11/18)
An availability ticket. I'm a little confused as to why there's still an alarm on the dashboard, as the argo page looks to my eyes like the site has had >85% availability over the last 30 days (only one non-100% day). On Hold (1/2)
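As a rough sanity check on that reading of the argo page, here is a minimal sketch. It assumes the 30-day figure is a plain mean of daily availability percentages, which may not match how ARGO actually computes it, and the function name is made up for illustration. Even if the one non-100% day had been a complete 0%, the mean would still sit well above the 85% mark.

```python
def monthly_availability(daily_percentages):
    """Plain mean of daily availability percentages.

    Assumption: a simple unweighted mean; the real ARGO A/R
    calculation may treat downtimes and unknowns differently.
    """
    if not daily_percentages:
        raise ValueError("need at least one daily figure")
    return sum(daily_percentages) / len(daily_percentages)


# Worst case for "only one non-100% day" in a 30-day window:
# 29 perfect days and one day at 0%.
worst_case = [100.0] * 29 + [0.0]
THRESHOLD = 85.0  # the figure quoted in the bulletin

figure = monthly_availability(worst_case)
print(f"{figure:.2f}%")    # prints 96.67%
print(figure > THRESHOLD)  # prints True
```

So under this simplified model the alarm can't be explained by the daily figures alone, which fits with the confusion above.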
131610 (3/11/17)
ECDF's v6 ticket. Some positive news back in early December, the ticket could do with an update. In progress (4/12/18)
DURHAM
131609 (3/11/17)
Another site with just the v6 ticket. Last update was the start of December, any news from your network team at all? On Hold (4/12/18)
SHEFFIELD
138649 (3/12/18)
Sheffield's t2k DFC migration ticket. The site's status is the same as Oxford, and was included in Daniela's query to t2k in that ticket. In progress (9/1)
131608 (3/11/17)
Sheffield's v6 ticket. In great need of an update. In progress (30/10)
MANCHESTER
131607 (3/11/17)
Only the v6 ticket at Manchester too. Things were looking good towards the end of last year, any news? In progress (27/11/18)
LIVERPOOL
139411 (30/1)
A request from Biomed querying if they still need to use the -s option to use the site's space token (note that they're still using lcg tools). John replied that currently this is still the case, but in the DOME future it won't be (due to quotatokens being applied to a directory). On Hold (1/2)
138648 (3/12/18)
Liverpool's t2k DFC migration ticket. Unlike the other two sites Liverpool is planning on migrating to DOME soonish, so they might not require a "clean slate solution". On Hold (18/12/18)
131606 (3/11/17)
Liverpool's v6 ticket. Last report had the networking team look at this in the New Year (so now-ish) to dual stack the storage, whilst the perfsonars are happily dual-stacked already. Please update the ticket once you know more (which will hopefully be soon-ish). In Progress (5/12/18)
LANCASTER
137996 (30/10/18)
A ROD ticket for an http test failure caused by DPM not handling http file moves quite right. Waiting on an updated version of DPM to get into epel - I will ask the devs today how that's going. On Hold (14/1)
UCL
139101 (8/1)
A ROD ticket for APEL publishing test failures. Ben has called Andrew McNab in for help installing things. In Progress (30/1)
RHUL
131603 (7/11/17)
Just the v6 ticket at RHUL too. Simon confirms that there's been no news on this front. In progress (23/1)
QMUL
139430 (30/1)
Another CMS ticket to update the site-local-config. Daniela has sorted it and has asked CMS to confirm. Waiting for reply (4/2)
139097 (7/1)
LHCB seeing data transfer problems, but this was a while ago. Dan has asked if problems persist. Waiting for reply (30/1)
138364 (19/11/18)
QM's t2k DFC migration ticket. Dan was ready to do the data moving bit, and just asked for confirmation of what needed to be done. Is the move underway Dan? In progress (16/1)
134573 (17/4/18)
CMS request to install singularity. Dan is rolling this into the move to C7, which was in the testing phase last November. Any recent news? On Hold (5/11/18)
IMPERIAL
139454 (31/1)
A ticket from a t2k user having trouble accessing post-DFC migration data at RALPP - which for reasons had to be routed to Imperial. Daniela can't spot any problems, so it looks like a user side issue. Although it might be worth checking the t2k.org .lsc files at RALPP. Assigned (should be something else) (31/1)
138359 (19/11/18)
Daniela runs such a tight ship at IC that she has to assign other issues to her site - this is the DFC migration master ticket. On Hold (22/1)
BRUNEL
139344 (28/1)
CMS transfer failures at Brunel. The storage is working fine, but it looks like some files that CMS thinks should be at Brunel aren't there, with no explanation of where they went. It's being investigated. In progress (4/2)
100IT still have ticket: 137306 (last update 16/1)
TIER 1
138361 (19/11/18)
The Tier 1's t2k DFC migration ticket. The ticket looks done with, just waiting on t2k to see if things are okay. That seems to be a little unclear, but that might be a VO side problem. In progress (31/1)
138665 (4/12/18)
The original mice LFC ticket, on hold whilst the above is sorted out.
139476 (1/2)
With the MICE LFC dead in the water this is the request for a dump to migrate to the DFC. In progress (4/2)
139306 (24/1)
A request from Duncan to upgrade the RAL perfsonar hosts (and fix some configs). In progress (29/1)
138891 (17/12)
A ROD availability ticket that looks a bit off - John thinks this is due to invalid tests being run and has opened a counter ticket: 139198 - from that the test in question is due to be removed this week. On Hold (16/1)
139477 (1/2)
A ROD ticket for a couple of sickly ARC CEs. One node is fixed, the other was already on the naughty step for having a high load (possibly from the A-REX slapd process), and it's being poked and prodded. In progress (4/2)
138500 (26/11/18)
CMS transfers from T2_PL_SWIERK failing. File transfer experts were about to be called in, and the ticket is now On Hold. Is it going to be a tough one to debug? On Hold (30/1)
138033 (1/11/18)
Atlas ticket for singularity job failures at RAL. Still lots of back and forth here, with great efforts from James and Alessandra. In progress (31/1)
139414 (30/1)
LHCB jobs seg faulting. It appears these errors all occurred on VMs, and now that those VMs have passed on, the errors have disappeared too. As there's no way to easily proceed (VM necromancy isn't a thing afaik) it looks like this one can be closed. In progress (4/2)
Tuesday 29th January 2019, 10.00 GMT
36 Open UK Tickets today.
TIER 1
138665 (4/12/18)
My apologies for being a nag, but this MICE LFC ticket still hasn't had an update this side of Christmas. Could someone please take a look and update the ticket (or at least re-acknowledge the ticket's existence). In progress (12/12/18)
138500 (26/11/18)
This CMS transfer ticket is a little quiet, although I suspect that's due to a lot of conversation going on along other channels and work is ongoing on the issue. Are my suspicions correct? In progress (17/1)
QMUL
139097 (7/1)
In a similar nagging tone, no words have been added to this ticket, from either side (site or LHCB). Is the issue still an issue now that (I believe) the works at QM are finished? In progress (8/1)
ECDF
139240 (21/1)
A comment aimed at LHCB rather than the site - have you been able to check that the issue at hand (which looked to be a classic black hole node) has been dealt with? The last report from the VO mentioned there were other issues preventing seeing if things were solved. In progress (22/1)
BIRMINGHAM
137801 (17/10/18)
The aspirational switch off date for the Birmingham DPM was yesterday. How did that go Mark? Do you now feel like a huge weight is off your shoulders? In Progress (22/1)
LIVERPOOL
138943 (19/12/18)
Just in case the Liver lads haven't seen it, this LHCB transfer issue is no more and the ticket can be closed. In progress (28/1)
BRISTOL
131613 (3/11/17)
To keep with the positives, the Bristol IPv6 ticket looks to be almost finished with - firewall ports are open so we just need to see if PS tests run fine. Nice. In progress (29/1)
Monday 21st January 2019, 16.30 GMT
39 Open UK Tickets this week.
First a look at a few regular tickets:
TIER 1
138665 (4/12/18)
This MICE LFC ticket that was mentioned last week still could do with some attention, it still hasn't been updated since last year. It looks like a connection issue (and a bit of an odd one at that). In Progress (12/12/18)
RALPP
139222 (18/1)
A ROD ticket for webdav test failures. Chris has asked where to get some help with figuring out the error code seen when the test fails - the test description link in the ticket appears to be broken. In progress (21/1)
QMUL
139097 (7/1)
Any luck fixing these LHCB data transfer failures? In progress (8/1)
THE IPv6 TICKETS
OXFORD: 131615
Kashif provided a comprehensive update at the start of the month - it looks like some progress is soon going to be made, although it will be a slow process due to the low priority of IPv6 with the Oxford networking people. On Hold (7/1)
BRISTOL: 131613
Things are looking positive at Bristol, just waiting on some holes in the site firewall for the perfsonar boxen. Any luck with that? In progress (21/12)
BIRMINGHAM: 131612
More positive news, the new central infrastructure is in place and so hopefully Mark can have a go at dual-stacking soon (I assume after he's killed off his DPM). On Hold (24/12/18)
GLASGOW: 131611
Sadly not so positive an update from Glasgow - Gareth explained how their perfsonar revealed v6 traffic issues when it was dual-stacked (which is its job), so they're waiting on this getting fixed. Luckily the usual sticking point of v6 reverse DNS isn't an issue. In progress (4/12)
ECDF: 131610
Rob explained in the last update how the physical migration of the site *didn't* break the v6 connectivity of the perfsonar and test DPM (yey!). Dualstacking the production storage was predicted to start around nowish (give or take a month I assume). Any recent news? No worries if there's not though. In Progress (4/12/18)
DURHAM: 131609
Adam forwarded Duncan's information about the JISC Secondary DNS service to the Durham networking team - v6 packets can otherwise flow (just no DNS!). Any word back from them? On Hold (4/12)
SHEFFIELD: 131608
There was hope that the perfsonar box could be dualstacked in November, but I assume the usual end of the year rush happened. Any luck dual-stacking it this year? An update for this ticket would be great. In progress (30/10/18)
MANCHESTER: 131607
Manchester got a shiny new IPv6 range towards the end of last year. Any luck dual-stacking your storage yet? Any timeframe for doing so if you haven't got round to it yet? In progress (3/12/18)
LIVERPOOL: 131606
John gave a nice chunky update last month - the site stands ready to dualstack their storage, but just waiting on getting the WAN routing fixed (hopefully sometime soonish). But at least their perfsonar is happily v6'd. In Progress (5/12/18)
RHUL: 131603
At last check RHUL were waiting on v6 DNS before they could proceed. Simon reported that RHUL were looking at outsourcing this service to JANET, but no word on if/how well that's going/gone. Any news? An update would be appreciated. In progress (29/10)
Monday 14th January 2019, 14.30 GMT
40 Open UK Tickets this week.
T2K DFC Migration on DPMs
Liverpool: 138648
Oxford: 138647
Sheffield: 138649
Lancaster: 138365
A quick summing up of these tickets - to provide the information T2K need (namely adler32 checksums for files that don't already have them) it appears your DPM needs to be DOME'd. At Lancaster we seem to be having the most luck with this so far, so please feel free to prod me about it.
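For anyone wanting to see what an adler32 checksum of a file looks like outside of DPM, here is a minimal sketch using Python's standard zlib module. It is not how DPM or DOME produce their checksums; the function name and the zero-padded eight-digit hex output format are assumptions made for illustration.

```python
import zlib


def adler32_of_file(path, chunk_size=1024 * 1024):
    """Return the Adler-32 checksum of a file as 8 hex digits.

    The file is read in chunks so large files need not fit in memory.
    The lowercase, zero-padded output format is an assumption.
    """
    checksum = 1  # Adler-32 starts from 1, not 0
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            # Feed the running value back in to continue the checksum.
            checksum = zlib.adler32(chunk, checksum)
    return f"{checksum & 0xFFFFFFFF:08x}"
```

A file containing just the ASCII text "Wikipedia" gives 11e60398, and an empty file gives 00000001.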
v6-looking transfer problems
Liverpool (lhcb): 138943 (19/12)
RALPP (atlas): 139127 (10/1)
Whilst for different VOs there's a common theme to both of these tickets - it looks like the failing transfers are trying to use IPv6. Any thoughts? Update - both tickets have been looked at further, the Liverpool ticket was a firewall issue and should be fixed. Chris has looked into the RALPP errors and is a little confused as there don't seem to be any v6 routing problems but there are too many v6 transfer failures.
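When a transfer problem smells like v6, one quick first check is whether the endpoint resolves to IPv6 addresses at all, since a dual-stacked host will usually be tried over v6 first. Below is a minimal sketch using only the Python standard library. The function names are hypothetical, the addresses in the example come from the reserved documentation ranges, and it only shows what DNS returns, not whether the v6 path actually works.

```python
import ipaddress
import socket


def split_by_family(addresses):
    """Split a list of IP address strings into (v4, v6) lists."""
    v4, v6 = [], []
    for text in addresses:
        ip = ipaddress.ip_address(text)
        (v6 if ip.version == 6 else v4).append(str(ip))
    return v4, v6


def resolve(hostname, port=443):
    """Return the sorted set of addresses the resolver gives for a host.

    Needs working DNS, so it is not called in the example below.
    Port 443 is only a placeholder default.
    """
    infos = socket.getaddrinfo(hostname, port, proto=socket.IPPROTO_TCP)
    return sorted({info[4][0] for info in infos})


# Example with documentation-range addresses (not real endpoints):
v4, v6 = split_by_family(["192.0.2.10", "2001:db8::10"])
print(v4)  # prints ['192.0.2.10']
print(v6)  # prints ['2001:db8::10']
```

Running split_by_family(resolve(...)) against a real storage endpoint and getting a non-empty v6 list means transfers may well be attempted over IPv6, which is where firewall holes like the Liverpool one start to matter.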
Bristol LHCB Ticket
138402 (21/11/18)
Are the issues described in this ticket still happening? That might be a question for the VO rather than the site. (6/12/18)
Last Year's Tier 1 Tickets:
138665 (LFC access issues)
138500 (CMS transfer failures)
138361 (T2K DFC migration)
A quick note that none of these tickets have had an update from the site yet this year to indicate that they've been picked up again after the Holiday break.
Extra Extra 139152 - This Sheffield LHCB ticket from the weekend seems to have been missed, it looks like there might be a black hole node gobbling up LHCB jobs.