Tuesday 23rd September
- Down to 19 tickets this week.
- The GGUS summary is available for review.
Tuesday 16th September 2014, 10.00 BST
We seem to once again be in the Dark Ages here at Lancaster, with yet another power outage that is overrunning. Hopefully I'll be at the meeting though, thanks to a huge laptop battery and a brave little eduroam wireless hub that somehow is still up and running. Sorry for the lack of email warning, and trampling over Jeremy's update!
26 Open UK tickets this week.
Regarding the mismatch between BDII and SRM storage numbers for ATLASHOTDISK at RAL: Maria asked a question about this last week, but there's no answer yet. In progress (3/9)
Not really a Tier 1 issue: Sno+ are trying to allow some remote SUSE users to access data. Henry has given some nice suggestions from his experience with MICE, using a java srm web gui thing. In progress (10/9)
100IT have been asked to set up the vmcatcher tool at their site, but have hit a snag at the first hurdle of creating an appdb account to allow them to download the images. Anyone have any experience with this? Looks like a counter ticket is needed, which the 100IT guys might not be confident in doing themselves. Waiting for reply (10/9)
Sno+ ticket about wanting to set software tags on ARC CEs. Ewan has replied with a comprehensive but not particularly positive (from Sno+'s point of view) post, but there's some hope that they can get what information they need from the VO nagios pages (which Kashif has got working for ARC CEs now as well). I don't think there's anything more that can be done here, but we might want to give Matt M a chance to reply. In progress (15/9)
Sheffield's perfsonar box playing up. Elena has tried to get it back on its NICs, but no joy. My advice is a reinstall or a mail to perfsonar support (or at least TB-SUPPORT). In progress (9/9)
Duncan ticketed QM's IPv6 test perfsonar about not initiating any tests in the Test Mesh. Glancing at the link it looks to me like this is no longer the case, but if I'm mistaken and there will be no progress for a while it would be nice to put this ticket on hold. In progress (8/9)
Pheno have been having a sort out of their storage in the UK (108334). The ECDF version of this ticket seems to have created some confusion, as the Edinburgh chaps don't support pheno on their storage, whilst Pheno are under the impression that they aren't supported at ECDF at all. In Progress (10/9)
Tuesday 16th September
- 23 open tickets this week.
- The GGUS summary is available for review.
- With reference to tickets discussed last week (OK. means done. Y. means continuing).
NO SITE IN PARTICULAR
As seen on TB-SUPPORT, the NGI has a ticket telling it to get sites to have the new voms servers configured for the switch over. Jeremy has kindly offered to field the ticket. I think we all have this in hand, but as I type this I realise I may have forgotten to set things up for the ops VO. I encourage everyone to double check their readiness ahead of next Monday's switchover. Assigned (8/9)
The RAL FTS2 service has been shutdown for nearly a week now, so I suspect this ticket tracking the switch off can be closed. In progress (3/9)
CMS having trouble running a "locateall" AAA test at RALPP (TBH I don't know what that is) - Chris has let them know that this is due to their xrootd reverse proxy being down, and it should be up and running in a day or two after it's reinstalled. In progress (8/9)
As mentioned last week, Sno+ have been having trouble as they can't assign software tags on ARC CEs, and they use these tags to do stuff like black/white listing. There was some discussion on this in the ticket, but it fizzled out, I suspect due to the topic moving offline. Can it have an update please? In progress (27/8)
CMS transfer problems to Bristol. Winnie posted an update, in which she mentioned that she has applied a fix to their Storm that might have fixed the problem. Maybe. She's asked if the problem still persists, as the monitoring links provided have all gone stale. Lukasz is on leave, can anyone CMS savvy help her? Waiting for reply (8/9)
CMS Pilots losing contact with home base. No progress since Winnie noticed that the problem only seems to affect one of the Bristol clusters, but none expected due to leave. On Hold (8/9)
Ok. Update - Bristol have another, possibly related CMS ticket 108317
Maarten ticketed ECDF about this CE's not having the new voms servers configured. Andy is working on it. There's a reminder that, on top of adding the right configs, services do need restarting. In progress (5/9)
glexec tarball ticket. There's a bit more movement on getting this done, but it's all on me to get the tarball glexec working still - naught the Edinburgh chaps can do.
Duncan noticed some interesting goings on on the Durham perfsonar page. The Durham chaps are talking to their networking team to figure out what the flip is going on. In progress (8/9)
Duncan's unwavering gaze also noticed a problem on Sheffield's perfsonar. Elena was tweaking it when it broke, and it looks like it's still broken, any luck fixing it Elena? In progress (26/8)
Liverpool got a ROD ticket when their CREAM CE got poorly. Steve worked his magic and things were fixed, but Gareth asks about the persisting BDII tests still failing. Solved (8/9) Update - the problem seems to have disappeared, so it was probably just an artifact of BDII lag.
My personal shame number 1. Lancaster's poor perfsonar performance. Despite a reinstall of the box, and no signs of a bottleneck in transfers or when running manual tests, we still have really poor perfsonar results. No problems with the network have been found. Duncan helped formulate a plan at GridPP, but I haven't had the time to test it out yet. On hold (8/9)
My personal shame number 2 - Lancaster's glexec deployment ticket. Some news in that I have something I'd like to test now - I just need to find time to test it, then see if I can package it somehow. On hold (8/9)
UCL's glexec deployment ticket. This work was pushed back to the end of August - any news on it? On Hold (29/7)
A ROD ticket for UCL APEL publishing errors. The apel admins got involved and things are looking better now - although Gareth points out that there is some missing data in the Spring. In progress (8/9)
Pointing VO_SNOPLUS_SNOLAB_CA_SW_DIR to /cvmfs/snoplus.gridpp.ac.uk. No news for a while on this after it was acknowledged - has the job fallen to the bottom of the stack? In progress (22/8) Solved now, issue was dealt with last week but the ticket wasn't updated.
Duncan ticketed QM about one of their perfsonar boxen - which Dan pointed out is their IPv6 perfsonar. So does that mean this ticket can be closed? In progress (4/9) Update - Duncan would like the ticket kept open to track this node's assimilation into the mesh.
Longstanding LHCB ticket with JET. No movement on this, but none was expected. Still if anyone wants to heroically interject with some ideas I'm sure it would be appreciated. On hold (29/7)
As mentioned last week, Matt M of Sno+ fame has a user who only has access to srm tools and is having trouble accessing files at RAL. Brian has suggested using the webfts, but Matt doesn't think this will work for the user's limited abilities. Any thoughts? In progress (8/9)
Inconsistency between BDII and SRM reported storage capacity...hang on, haven't we been here before (105571)? It's not quite the same problem, but it's close. Brian has confirmed the mismatch, Maria has asked for an explanation for it (and how it only really affects ATLASHOTDISK). In progress (3/9)
Checking the site firewall configuration for RAL's Vidyo router. Last update was in July, is the dialogue between the Vidyo team and the RAL networking chaps ongoing? On hold (1/7)
The Tier 1's version of 106325 - CMS pilots losing contact. This was waiting on the firewall expert getting back from hols to compare the settings between the Tier 1 and Tier 2 (who don't see this issue). Are they back yet? On Hold (14/8)