Past Ticket Bulletins 2014

Monday 12th May 2014, 14.30 BST

A mere 27 open tickets for the UK today.

NGI
https://ggus.eu/index.php?mode=ticket_info&ticket_id=101502 (24/2)
The ILC cvmfs ticket. Only Durham is left (I actually missed out Durham the last few times I looked at this ticket). So it's all on you Durham chaps now. No pressure (except, there is a little bit). In Progress (7/5)

SUSSEX
https://ggus.eu/index.php?mode=ticket_info&ticket_id=102810 (28/3)
Sussex's EMI3 upgrade ticket. Matt's fighting the good fight, and hopes to have it all sorted soon. Let us know if you need a hand Matt! In Progress (8/5)

RALPP
https://ggus.eu/index.php?mode=ticket_info&ticket_id=105290 (9/5)
The ROD has spotted Glue2 Validation errors on the RALPP bdii. Chris B spotted the ticket, but no news. In progress (9/5)

BRISTOL
https://ggus.eu/index.php?mode=ticket_info&ticket_id=102205 (14/3)
Bristol's EMI3 ticket. Winnie has beaten the site-BDII into EMI3 shape and is visiting the same fate on their cream CEs and WNs, with one CE already converted and two more about to fall. Make sure you get the WNs too! On Hold (should really be In Progress) (12/5)

https://ggus.eu/index.php?mode=ticket_info&ticket_id=105189 (6/5)
LHCB jobs having some trouble at Bristol. Winnie thinks some dodgy nodes are at fault and is working on it. Waiting to see if the failures continue. Waiting for Reply (7/5)

GLASGOW
https://ggus.eu/index.php?mode=ticket_info&ticket_id=101565 (26/2)
Publishing Max CPU time for LHCB. I believe that we've left it with LHCB asking that it be set to "a value that is obviously made up but isn't the default value" (although I could have the wrong end of the mace here). Been on hold for a while, so we probably want to make some kind of ruling. On Hold (8/4)

EDINBURGH
https://ggus.eu/index.php?mode=ticket_info&ticket_id=95303 (1/7/2013)
glexec ticket. No news here - sorry. On Hold (27/1)

https://ggus.eu/index.php?mode=ticket_info&ticket_id=102201 (14/3)
The ECDF EMI3 upgrade ticket. Had some problems with a lingering ghost of their previous site-BDII, but hopefully time has exorcised that gremlin and the new EMI3 CE will be seen too, which just leaves one straggler to be dealt with. On Hold (should probably be In Progress) (9/5)

https://ggus.eu/index.php?mode=ticket_info&ticket_id=105267 (8/5)
The other ECDF EMI3 upgrade ticket. Actually this only got submitted by Daniela to satisfy the dashboard demons, probably as you can't physically lift the ROD dashboard to throw it out of the window and shut it up that way. On Hold (12/5)

DURHAM
https://ggus.eu/index.php?mode=ticket_info&ticket_id=103722 (14/4)
Durham's EMI3 upgrade ticket. Daniela has extended the ticket to the zeroth hour. Let us know if you chaps get stuck on anything, but it looks like you have the upper hand. In Progress (2/5)

SHEFFIELD
https://ggus.eu/index.php?mode=ticket_info&ticket_id=105090 (2/5)
Sheffield had some CE nagios failures, but it looks like that storm has passed, with nothing but green as far as the eye can see on the nagios pages. Elena asks if she can close the ticket (i.e. has the alarm disappeared from the dashboard?). Waiting for reply (12/5)

LIVERPOOL
https://ggus.eu/index.php?mode=ticket_info&ticket_id=105299 (9/5)
Liverpool have also received a ROD ticket, this time of the Glue2 validation variety. Steve has set it in progress. In Progress (9/5)

LANCASTER
https://ggus.eu/index.php?mode=ticket_info&ticket_id=95299 (1/7/2013)
Lancaster's glexec ticket. No news I'm afraid. On Hold (4/4)

https://ggus.eu/index.php?mode=ticket_info&ticket_id=100566 (27/1)
Lancaster's PerfSonar sucking. Duncan has suggested a reinstall, and noticed spikes of goodness. A reinstall has been put on the todo list. On Hold (12/5)

UCL
https://ggus.eu/index.php?mode=ticket_info&ticket_id=102193 (14/3)
UCL's EMI3 upgrade ticket. Quiet, but Ben had scheduled the date for the upgrade as the 13th. Hopefully we'll hear positive news from him shortly. On Hold (30/4).

https://ggus.eu/index.php?mode=ticket_info&ticket_id=101285 (16/2)
UCL's perfsonar host carking it. At last word Ben had brought it back from the great beyond and hoped to have a reinstall done on 30/4. No word since though. On Hold (28/4)

https://ggus.eu/index.php?mode=ticket_info&ticket_id=95298 (1/7/13)
UCL's glexec ticket. Ben mentions a new chap being deputised, and that this will likely have to wait until then. On Hold (16/4)

https://ggus.eu/index.php?mode=ticket_info&ticket_id=104824 (22/4)
Nagios ticket due to low site availability, caused by a period of outdated CA RPMs. Just waiting for the numbers to pick up again. In progress (6/5)

QMUL
https://ggus.eu/index.php?mode=ticket_info&ticket_id=103028 (6/4)
A much talked about (and rightly so) atlas ticket, about job failures at QM essentially due to atlas jobs not requesting the right amount of RAM. There's a question from atlas "if all the questions have been answered". Have they? In Progress (8/5)

BRUNEL
https://ggus.eu/index.php?mode=ticket_info&ticket_id=105324 (12/5)
Brunel are having some bother with their APEL publishing, it looks like there's a lot of missing data. In progress (12/5)

TIER 1
https://ggus.eu/index.php?mode=ticket_info&ticket_id=105161 (5/5)
Hone noticed their jobs in the ready status for a long time whilst submitted through the RAL WMSeses. Catalin has been engaging with Alexander to debug the issue. Waiting for reply (12/5)

https://ggus.eu/index.php?mode=ticket_info&ticket_id=105100 (2/5)
CMS have embarked on their next Storage Consistency Check. Andrew closed the ticket after providing the desired information, but CMS have reopened (wanting to keep the ticket to track the SCC). Reopened (needs to be put In Progress or On Hold) (12/5)

https://ggus.eu/index.php?mode=ticket_info&ticket_id=98249 (21/10/13)
cvmfs for Sno+. Things have picked up pace on this ticket, with Matt M ready to kick off uploading the Sno+ tarball. Catalin has tweaked the web access to allow him to do so. Waiting for reply (12/5)

https://ggus.eu/index.php?mode=ticket_info&ticket_id=105308 (11/5)
Atlas MCORE jobs failing with "Failed to open shared memory object: Permission denied". RAL team are looking at it. In progress (12/5)

EFDA-JET
https://ggus.eu/index.php?mode=ticket_info&ticket_id=97485 (21/9/2013)
Longstanding LHCB authentication problem at JET. The Jet admins have exhausted all their ideas, and have asked for any help. As the problem survived the upgrades to SL6 and EMI3 it's probably something specific with their setup. On Hold (25/4)

Tuesday 6th May

IOU one full ticket review - Matt.

EMI3 upgrade: Down to four EMI upgrade tickets - well done everyone. ECDF, Sussex, Bristol and UCL left to go. Things look good on the UCL and Edinburgh fronts. Bristol and UCL aren't progressing as well, but are adamant that they'll make the deadline. The ECDF ticket triggered a ticket to the ROD, but that's being sorted.

CVMFS for ILC: How are things going at Oxford and Glasgow for rolling out cvmfs for ilc (https://ggus.eu/index.php?mode=ticket_info&ticket_id=101502)? IIRC Glasgow were just waiting for the changes to gently percolate, Oxford were waiting for Kashif "The Puppet Master" to return to work his magic. That just leaves Bristol unaccounted for.

QMUL: https://ggus.eu/index.php?mode=ticket_info&ticket_id=103028
This atlas ticket, essentially regarding atlas jobs using more resources than they said they would (and thus being killed), has seen a lot of discussion at the Thursday atlas meeting. I thought I'd mention it in case anyone outside that meeting wants to weigh in.

Afraid that's all folks!

Monday 28th April 2014, 16.30 BST
I'm afraid the ticket roundup is incredibly light and not in the usual (or any) format.

EMI upgrade tickets: ECDF, Bristol, RHUL, Durham, EFDA-JET, Glasgow, Sussex, UCL and RALPP all have open EMI upgrade tickets. Can everyone with an open ticket please update it this week (preferably by the first) if they haven't done so in the last 7 days (or if you have but have made progress since then). It's a lot easier for the Person on Duty to extend tickets when there are site updates to validate their actions.

(RALPP have submitted https://ggus.eu/index.php?mode=ticket_info&ticket_id=104839 in response to an argus problem they were seeing post upgrade).

UCL have another Nagios error ticket: https://ggus.eu/index.php?mode=ticket_info&ticket_id=104824

Interesting One: https://ggus.eu/index.php?mode=ticket_info&ticket_id=104937 Manchester received a ticket from Steve Traylen regarding a lot of connections to the CVMFS stratum 1. Andrew confirms these are VAC machines (unless I've misread something). It looks like the local squid cache was being ignored, Andrew is on the case.

Afraid that's it from me. Next week's will be better (because it's the first Monday of the month... that came around quickly!).

Monday 14th April 2014, 15.30 BST
No ticket update from Matt next week.

33 Open UK tickets today.

NGI (No Geezers In-particular in this case)
https://ggus.eu/index.php?mode=ticket_info&ticket_id=101502 (24/2)
ILC cvmfs ticket. No change since last week really; after tomorrow's meeting I'll on hold this ticket until I'm back next week. In progress (3/4)

https://ggus.eu/index.php?mode=ticket_info&ticket_id=103043 (7/4)
Tom's ticket requesting cern@school access to the IC Dirac server. It's all done, the ticket just needs closing (and whilst I'm happy to stick my nose into tickets I won't close or reopen them). Assigned(!) (7/4)

https://ggus.eu/index.php?mode=ticket_info&ticket_id=103197 (9/4)
Chris W has spotted several instances where the old myproxy server shows up in the online documentation. Andrew has tried to edit https://www.gridpp.ac.uk/deployment/users/myproxy.html but can't get access - Daniela suggested asking the hosting site but maybe Tom has access? Waiting for Reply (9/4)

https://ggus.eu/index.php?mode=ticket_info&ticket_id=98249 (21/10/2013)
The Sno+ CVMFS ticket. Could some of the progress mentioned last week please be put into the ticket? In progress (26/3)

QMUL
https://ggus.eu/index.php?mode=ticket_info&ticket_id=103028 (6/4)
Chris ran these atlas job failures down and discovered they were due to the jobs going over their memory quotas. What I didn't like the look of was that it was the jobs themselves requesting these amounts of memory. Atlas says it can be solved, but it's something to watch out for. In progress (11/4)

GLASGOW
https://ggus.eu/index.php?mode=ticket_info&ticket_id=101565 (26/2)
As mentioned last week, LHCB have got back to Glasgow deciding that MaxCPUTime needs to be set to something. Sam respectfully maintains his stance. Steve B links an interesting ticket to the cream devs: https://ggus.eu/index.php?mode=ticket_info&ticket_id=97721 On Hold (8/4)

"EMI UPGRADE" tickets.

TIER 1
https://ggus.eu/index.php?mode=ticket_info&ticket_id=102611
Kashif points out that the NGI argus isn't in the site bdii, which is the probable cause of the test failures. The other two problem servers are due to be decommissioned, so all good here. In progress (14/4)

DURHAM
https://ggus.eu/index.php?mode=ticket_info&ticket_id=103722 (14/4)
A very fresh alarm ticket for Durham's CE and SE. Sorry you guys have to do this dance again! Assigned (14/4)

EDINBURGH
https://ggus.eu/index.php?mode=ticket_info&ticket_id=102201 (14/3)
Andy notes that the links to the alarms given in the ticket appear to be broken. How goes the upgrade in general? On Hold (7/4)

RHUL
https://ggus.eu/index.php?mode=ticket_info&ticket_id=102189 (14/3)
I think RHUL just has some CEs to upgrade, have you done the site BDII? The list of services that need to be upgraded isn't exhaustive. On hold (21/3)

SUSSEX
https://ggus.eu/index.php?mode=ticket_info&ticket_id=102810 (28/3)
You guys put in a good plan, did it survive contact with the enemy? In progress (1/4)

GLASGOW
https://ggus.eu/index.php?mode=ticket_info&ticket_id=102202 (14/3)
The Glasgow list of services to upgrade was long, but that's just a reflection of how much stuff they run. Gareth gave a good update last week, so there's naught to worry about here (hopefully I didn't just curse you...). In Progress (8/4)

BRISTOL
https://ggus.eu/index.php?mode=ticket_info&ticket_id=102205 (14/3)
Winnie sounded confident that the upgrade will be done by the end of April (and we aren't halfway through the month yet). In progress (4/4)

UCL
https://ggus.eu/index.php?mode=ticket_info&ticket_id=102193 (14/3)
Ben set a reminder date for the 31st of March, no news since then. On hold (14/3)

EFDA-JET
https://ggus.eu/index.php?mode=ticket_info&ticket_id=102166 (14/3)
It's just the Jet DPM that looks like it needs upgrading. If they've kept it up to date then this upgrade is trivial. Hope to be done by the end of April. On hold (24/3)

Monday 7th April 2014, 13.30 BST

32 Open UK tickets this week.

No site in particular.
https://ggus.eu/index.php?mode=ticket_info&ticket_id=101502 (24/2)
The ILC cvmfs rollout ticket. Glasgow, Oxford, Durham and Bristol were missing at last head count - although Glasgow are mid-rollout and should be fully deployed any day now (if not already). I think Oxford are in a similar boat? As JK points out, we've got to the point where we probably need to on hold the ticket whilst I harass the last few stragglers. In progress (3/4)

https://ggus.eu/index.php?mode=ticket_info&ticket_id=103043 (7/4)
Squire Whyntie has asked for cern@school registration on the Imperial Dirac. Janusz has done so and Tom confirmed it works and can be solved. If only all things were solved so quickly! Assigned (7/4)

SUSSEX
https://ggus.eu/index.php?mode=ticket_info&ticket_id=102810 (28/3)
The new Sussex EMI2 upgrade ticket. Matt RB copied the Sussex plan over from the original ticket. Daniela cleared up the mystery of what happened to the original ticket (dashboard shenanigans) and posted some useful instructions for the BDII upgrade. In progress (1/4)

RALPP
https://ggus.eu/index.php?mode=ticket_info&ticket_id=102990 (3/4)
Duncan's unending perfsonar vigilance discovered a problem with the RALPP latency box. Ian reports firewall problems that have been solved, so it looks like this one can be closed (if all is well). In progress (can be closed) (4/4) Not quite out of the woods yet after all: Ian spotted and fixed a few more problems, and Duncan has spotted something else awry.

https://ggus.eu/index.php?mode=ticket_info&ticket_id=102953 (24/3)
CMS glidein hammercloud jobs not running at the site (specifically their defunct cream CEs). Chris points out another ticket (https://ggus.eu/index.php?mode=ticket_info&ticket_id=102915) essentially detailing the same problem (just for different job types). Probably worth on holding this one whilst waiting on the other, as it looks like the problems are CMS side. In progress (2/4)

OXFORD
https://ggus.eu/index.php?mode=ticket_info&ticket_id=103027 (5/4)
LHCB pilots aborting, Kashif asks if the problem persists, the ticket fairy set the ticket to Waiting for Reply (5/4) SOLVED

https://ggus.eu/index.php?mode=ticket_info&ticket_id=102469 (19/3)
cvmfs for t2k. I think this has fallen through some cracks, no word for a while. In progress (21/3)

BRISTOL
https://ggus.eu/index.php?mode=ticket_info&ticket_id=102205 (14/3)
Bristol's EMI2 upgrade ticket. Not much news, although a positive update from Winnie suggests the April deadline will be made. In progress (4/4)

GLASGOW
https://ggus.eu/index.php?mode=ticket_info&ticket_id=102914 (1/4)
An atlas ticket, detailing some odd transfer behaviour for some files, likely attributed to some off tcp window settings on a disk server. There was a similar looking (although possibly not identical) problem at RHUL (https://ggus.eu/index.php?mode=ticket_info&ticket_id=102311). Some interesting stuff. In progress (4/4) Sam updated the ticket, with no more "sub-optimally tuned" disk pools. I think it should be set to waiting for reply though

https://ggus.eu/index.php?mode=ticket_info&ticket_id=102202 (14/3)
Not as interesting, Glasgow's EMI upgrade ticket. Chugging along; last word was from David a little while back about watching some atlas canary jobs running on the EMI3 worker nodes. How did these pan out? In progress (27/3) Gareth updates that progress is slow but steady, draining nodes is taking a while.

https://ggus.eu/index.php?mode=ticket_info&ticket_id=101565 (26/2)
LHCB asked Glasgow to publish their max CPU time. Not wanting to be made liars of, Sam pointed out why they didn't (shouldn't) do this. This has seemed to send LHCB back to the drawing board, so the ticket is on hold. On Hold (12/3)

EDINBURGH
https://ggus.eu/index.php?mode=ticket_info&ticket_id=102201 (14/3)
The ECDF EMI upgrade ticket. Not much to report here, although the apel box threw a wobbly as well. Andy's on it. In progress (2/4)

https://ggus.eu/index.php?mode=ticket_info&ticket_id=95303 (1/7/13)
glexec ticket. Word on that later. On Hold (27/1)

DURHAM
https://ggus.eu/index.php?mode=ticket_info&ticket_id=102199 (14/3)
Another EMI upgrade deadline ticket. A plan is in place and the work is underway. On Hold (24/3)

SHEFFIELD
https://ggus.eu/index.php?mode=ticket_info&ticket_id=100037 (3/1)
Sheffield's perfsonar having trouble. Elena upgraded and got the Sheffield IT guys to open port 8086 - it looks like she's nailed the problem and has asked for confirmation. Waiting for reply (7/4) (And before I even finished the review, the ticket was solved).

LANCASTER
https://ggus.eu/index.php?mode=ticket_info&ticket_id=95299 (1/7)
GLEXEC ticket. The tarball glexec isn't going well (no thanks to EMI3 taking up the last 6 weeks of tarball time). I might have to admit defeat (but will ask the devs for help before I do). On hold (4/4)

https://ggus.eu/index.php?mode=ticket_info&ticket_id=100566 (27/1)
Lancaster's Poo Perfsonar Performance (I said I wouldn't use that alliteration again, I lied). Using "normal" iperf to probe the boxes I see no 1Gb bottlenecks in my network, so could the problem be software? On hold (7/4)

UCL
https://ggus.eu/index.php?mode=ticket_info&ticket_id=101285 (16/2)
UCL's perfsonar also having difficulty, although their difficulty is caused by the hardware going kaput on them. Ben is chasing up Dell for new bits. In progress (3/4)

https://ggus.eu/index.php?mode=ticket_info&ticket_id=102193 (14/3)
EMI upgrade ticket. Ben put in a brief plan, but the reminder date has passed. How goes it? The bdii and DPM are fairly straightforward to upgrade. On hold (14/3)

https://ggus.eu/index.php?mode=ticket_info&ticket_id=95298 (1/7/13)
glexec ticket. No news for a while, is this work to be rolled into the EMI3 upgrade? On hold (27/1)

RHUL
https://ggus.eu/index.php?mode=ticket_info&ticket_id=102189 (14/3)
RHUL's EMI upgrade ticket. Not much news here. On hold (21/3)

QMUL
https://ggus.eu/index.php?mode=ticket_info&ticket_id=103028 (6/4)
Atlas seeing production jobs failing due to pilot errors. Chris asked if production job options have changed recently? The ticket fairy struck again, setting the ticket to Waiting for reply (although he's less sure if that was the intention of Chris' reply). In progress (7/4) Atlas replied saying that they don't think there has been any job changes. Full prod disk is making things even cloudier, but Dan has asked for clarification on what an error message actually means - "!!FAILED!!1999!! Job killed by signal 24: Signal handler has set job result to FAILED, ec = 1204"

https://ggus.eu/index.php?mode=ticket_info&ticket_id=101639 (26/2)
RFC3820 proxy problems at QM (and elsewhere). JK has asked the submitter for his ticket intentions. Set to Waiting for reply by our friend, the ticket fairy. (1/4)

(Please remember to set your tickets to Waiting for Reply after asking a question to the submitter. Don't make me spend yet another Monday afternoon referring to myself as the ticket fairy.)

IMPERIAL
https://ggus.eu/index.php?mode=ticket_info&ticket_id=102888 (1/4)
Biomed asked for access to their cvmfs repo to be rolled out at IC. Daniela has said fine but asked that they completely migrate to it within 3 months (nfs or cvmfs). Daniela has completed the rollout and asked biomed to test. Waiting for reply (7/4) Biomed have got back saying that they've launched some test jobs, but expect it might take a while for them to run. I think they also were kinda asking if Imperial would give them some leeway on moving wholly to cvmfs.

EFDA-JET
https://ggus.eu/index.php?mode=ticket_info&ticket_id=102166 (14/3)
The JET EMI upgrade ticket. There was a hope to upgrade before the end of April. On hold (24/3)

https://ggus.eu/index.php?mode=ticket_info&ticket_id=97485 (21/9/13)
SSL type errors for LHCB at JET. No progress on this for a while, the problem somehow survived the move to SL6/EMI3. On hold (11/2)

TIER 1
https://ggus.eu/index.php?mode=ticket_info&ticket_id=102611 (24/3)
The Tier 1 EMI upgrade ticket. There seem to be some false positives on the list, and it could do with clarification of which these are (especially due to the dashboard noise on the ticket). In progress (27/3)

https://ggus.eu/index.php?mode=ticket_info&ticket_id=98249 (21/10/13)
CVMFS for SNO+. Matt reported that the collaboration has given permission to have their software on cvmfs, and hoped to have tarballs ready for last week. Has there been any progress offline? In progress (26/3) Update - Squire Whyntie informed me that this is being actively worked on offline, with Tom kindly providing assistance.

https://ggus.eu/index.php?mode=ticket_info&ticket_id=101079 (9/2)
ARC CEs publishing the wrong DefaultSE. Andrew has hacking on this on his todo list, but has bumped the issue down the list (which is fine, as it's low priority). In progress (1/4)

https://ggus.eu/index.php?mode=ticket_info&ticket_id=99556 (6/12/13)
The NGI argus ticket. I'm pretty sure that this can be closed, as argusngi.gridpp.rl.ac.uk is set up and has been tested by several sites, so all looks well here. On Hold (21/3)

https://ggus.eu/index.php?mode=ticket_info&ticket_id=101968 (11/3)
Atlas deletion errors at the Tier 1. The problem is known, but not well understood, and sadly persists (last set of errors reported on the 4th). Alastair has put in a good explanation of the symptoms. On hold (4/4)

Monday 31st March 2014, 15.00 BST
34 Open UK tickets this week.

TIER 1
https://ggus.eu/index.php?mode=ticket_info&ticket_id=101968 (11/3)
Atlas deletion errors at the Tier 1. Alastair posted a good explanation of the problem and some mitigation details, but atlas would like an update. On hold (12/3) Update - Problem persists, reminder set for 7/4

https://ggus.eu/index.php?mode=ticket_info&ticket_id=102611 (24/3)
The Tier 1's EMI upgrade ticket. Some false positives on this list, Kashif asks if the NGI argus is also a false alarm? In progress (28/3)

https://ggus.eu/index.php?mode=ticket_info&ticket_id=101079 (9/2)
Tweaking the ARCCE DefaultSE publishing. As a bit of bookkeeping can the priority be tweaked to less urgent (seeing as the issue isn't causing great woe). On hold (17/3)

As an aside tickets often are submitted using the default priority of "urgent" and category of "Incident" - if you catch these in your tickets then you should feel free to change them.

SUSSEX
https://ggus.eu/index.php?mode=ticket_info&ticket_id=102810 (28/3)
Sussex's original EMI upgrade ticket (102212) was closed "automatically" ("broken ticket - close by Operations Portal"), leaving this one in its stead. I'm not sure if the information Matt RB carefully posted in the previous ticket needs to be cut and pasted over to here. All seems a bit weird. In progress (28/3)

OXFORD
https://ggus.eu/index.php?mode=ticket_info&ticket_id=102469 (19/3)
https://ggus.eu/index.php?mode=ticket_info&ticket_id=102544 (21/3) Solved
A couple of Oxford tickets look a bit neglected (one about cvmfs for T2K, t'other an lhcb/torque problem). I suspect these got overlooked with the excitement of Pitlochry last week. In progress (21/3)

(Also there's ticket https://ggus.eu/index.php?mode=ticket_info&ticket_id=102740, which could be seen as either an annoyingly finicky request or the epitome of a low hanging fruit, for when you *really* need a win that day!). Also Solved.

SHEFFIELD
https://ggus.eu/index.php?mode=ticket_info&ticket_id=102489 (20/3)
Similarly at Sheffield, maybe this biomed "invalid publishing" ticket got forgotten about on the trip to sunny Scotland. In progress (20/3)

https://ggus.eu/index.php?mode=ticket_info&ticket_id=100037 (3/1)
The Sheffield perfsonar ticket. Things just needed finishing off by the looks of it - let us know if any advice is needed. On hold (11/3)

BIRMINGHAM
https://ggus.eu/index.php?mode=ticket_info&ticket_id=102404 (18/3)
Birmingham's perfsonar "being weird" (ignoring Bristol), although Matt fixed it. Just doing the post-game roundup to figure out what magic actually fixed things, but could do with an update in the ticket. In progress (20/3) Update-Solved

QMUL
https://ggus.eu/index.php?mode=ticket_info&ticket_id=101639 (26/2)
RFC3820 proxy problems. The problem is spread wider than QM, and likely needs a middleware patch or three to solve. Dan and Chris have asked for a master ticket to be created (failing that some more information would be nice). Nothing forthcoming from the submitter yet. I think this ticket has mutated to include the issues from RAL as well as QM. A bit of a mess. In progress (18/3)


Monday 17th March 2014, 14.00 GMT
47 Open UK tickets this week, a dozen of them are EMI2 retirement tickets so they'll get the lion's share of our attention.

NGI
https://ggus.eu/index.php?mode=ticket_info&ticket_id=101502 (24/2)
The ILC software area move ticket. IC, RAL, QMUL, Cambridge and Liverpool have moved. Lancaster moved but is (hopefully was) broken for ILC (pardon my noise). Assuming that anyone not mentioned on the ticket hasn't migrated the ILC SW_DIR yet that leaves the following list of uk sites to migrate their ILC software area:
OXFORD
GLASGOW
BRISTOL
RALPP
BIRMINGHAM
BRUNEL - Moved but hadn't updated the ticket
RHUL
DURHAM
MANCHESTER
(I might have missed some of you out, this list is from lcg-infosites and grepped using my admittedly poor eyeballs. A braver man would have crafted his own ldapsearch to glean this info).
If you don't want to make the change then removing support for ILC is a viable course of action, if you have made the change please update the ticket to let ILC know. In progress (17/3)
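
For anyone who would rather not trust their eyeballs, the "who is left" list can be worked out as a plain set difference. This is only a minimal sketch: the site names are typed in by hand from this bulletin, and the variable and function names are made up for illustration. In practice the list of supporting sites would come from lcg-infosites or an ldapsearch rather than a hard-coded set.

```python
# Sites named in this bulletin as supporting ILC (illustrative, hand-typed input;
# a real run would take this from lcg-infosites or an ldapsearch instead).
supporting_ilc = {
    "IC", "RAL", "QMUL", "CAMBRIDGE", "LIVERPOOL", "LANCASTER",
    "OXFORD", "GLASGOW", "BRISTOL", "RALPP", "BIRMINGHAM",
    "BRUNEL", "RHUL", "DURHAM", "MANCHESTER",
}

# Sites recorded on the ticket as having moved their ILC software area.
confirmed_on_ticket = {"IC", "RAL", "QMUL", "CAMBRIDGE", "LIVERPOOL", "LANCASTER"}


def sites_left_to_migrate(supporting, confirmed):
    """Return the supporting sites with no confirmation on the ticket, sorted by name."""
    return sorted(set(supporting) - set(confirmed))


pending = sites_left_to_migrate(supporting_ilc, confirmed_on_ticket)
print("\n".join(pending))
```

Because the inputs are sets, a site typed twice is only counted once, and a site that has moved but not updated the ticket (as Brunel had) still shows up as pending until it is added to the confirmed set.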

EMI ALARMS
Remember that you need to have *at least* your upgrade plans in these tickets within a fortnight of the ticket's submission - so by the 28th of March.
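
As a quick sanity check on that date, here is the arithmetic as a small sketch, assuming the tickets were submitted on 14/3 (the date shown against each alarm ticket) and taking a fortnight as 14 days:

```python
from datetime import date, timedelta

# Submission date shown against the EMI alarm tickets in this bulletin (14/3/2014).
submitted = date(2014, 3, 14)

# "Within a fortnight" taken as 14 days after submission.
deadline = submitted + timedelta(days=14)

print(deadline.strftime("%d/%m/%Y"))
```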

SUSSEX
https://ggus.eu/index.php?mode=ticket_info&ticket_id=102212 (14/3)
BDII is the culprit here, ticket acknowledged but no other news (or plan). In Progress (17/3)

RALPP
https://ggus.eu/index.php?mode=ticket_info&ticket_id=102207 (14/3)
ARC CE, a few CREAMs, site BDII and WNs - Chris reports that this is probably a false alarm, but is looking into it (in case the publishing is off). Chris has included a plan for the other components (if I'm reading right, is RALPP ditching all CREAMs?). In progress (14/3)

BRISTOL
https://ggus.eu/index.php?mode=ticket_info&ticket_id=102205 (14/3)
site BDII (seems to be a common one), some CEs and the WNs. Winnie has posted an assurance that the upgrade will be done in time, but I'm not sure if that'll count as a plan to the powers that be. In progress (17/3)

GLASGOW
https://ggus.eu/index.php?mode=ticket_info&ticket_id=102202 (14/3)
Lots of services, but Dave has given a detailed upgrade battleplan. In progress (17/3)

EDINBURGH
https://ggus.eu/index.php?mode=ticket_info&ticket_id=102201 (14/3)
I think site-BDII and WNs. Wahid has given assurances, but (sorry to be a pedantic patsy) not sure if that'll count as an "upgrade plan". In other news the testing for the SL6 WN tarball is going well. In progress (14/3)

DURHAM
https://ggus.eu/index.php?mode=ticket_info&ticket_id=102199 (14/3)
The Durham DPM, CE and site BDII are on the list. Assigned (14/3)

SHEFFIELD
https://ggus.eu/index.php?mode=ticket_info&ticket_id=102197 (14/3)
Some CEs and the APEL box (that's a guess). Elena has given a good plan, there's some hassle as their APEL and BDII box are shared. In progress (14/3)

UCL
https://ggus.eu/index.php?mode=ticket_info&ticket_id=102193 (14/3)
DPM, BDII, CE and WNs. Ben has said he will upgrade in the next few weeks. On hold (14/3)

RHUL
https://ggus.eu/index.php?mode=ticket_info&ticket_id=102189 (14/3)
A CREAM and the site BDII. Govind is planning his upgrade plans. In progress (14/3)

IMPERIAL
https://ggus.eu/index.php?mode=ticket_info&ticket_id=102185 (14/3)
Just some WNs. Engaged in testing and plan to upgrade the last cluster this week. In progress (14/3)

BRUNEL
https://ggus.eu/index.php?mode=ticket_info&ticket_id=102184 (14/3)
Just the DPM I think. Henry remarks that they're just about to embark on a physical server move and don't want to change anything significant before the move. I gave my recipe for the EMI3 dpm move in case it helps. In progress (14/3)

EFDA-JET
https://ggus.eu/index.php?mode=ticket_info&ticket_id=102166 (14/3)
Again just the DPM. Acknowledged, but no plan. In progress (17/3)

I'm actually pretty sure Lancaster should have got a ticket as we still have one cluster on the EMI2 tarball. I'm not going to complain though.


NORMAL TICKETS

QMUL
https://ggus.eu/index.php?mode=ticket_info&ticket_id=101639 (26/2)
This ticket about jobs using RFC3820 style proxies not working at QM is in an odd state. The user seemed to be confused as to what feedback he should give. In progress (17/3)

https://ggus.eu/index.php?mode=ticket_info&ticket_id=101916 (8/3)
Sorry to be picking on QM, but this 444444 publishing jobs ticket is looking neglected. I suspect you've been frying bigger fish but can you please show it (or even better, the underlying issue!) some love. Is this linked to your other information publishing problems? In progress (10/7)

UCL
https://ggus.eu/index.php?mode=ticket_info&ticket_id=101285 (16/2)
UCL Perfsonar ticket. After having his perfsonar box whacked by a power outage Ben is reinstalling, but is seeing some odd hardware issues. Has anyone else seen their R610s PERCs play up like this (only showing 3G partitions)? In progress (12/3)

MANCHESTER
https://ggus.eu/index.php?mode=ticket_info&ticket_id=102394 (18/3)
Just in, but similar to the ILC ticket - Catalin has asked Manchester to deploy cvmfs for t2k. In progress (18/3)

As always please pipe up if I've missed anything or if there's any other ticket related issues you want to bring up.


Monday 10th March, 13.00 GMT
Only 28 Open UK tickets this week.

NGI
https://ggus.eu/index.php?mode=ticket_info&ticket_id=101502 (24/2)
ILC moving to cvmfs for their software area. As Jeremy mentioned, after tomorrow we're going to start chasing sites that support ILC but haven't rolled out these changes. 4 sites have implemented the move and passed muster. A tip from me is to remember to update the software area entry in your CE's info system for ILC as well as on the nodes. In progress (10/3)

https://ggus.eu/index.php?mode=ticket_info&ticket_id=101820 (5/3)
This goc db ticket ended up assigned to the UK. I've punted it in the direction of the GOC DB support unit. Assigned (10/3)

EDINBURGH
https://ggus.eu/index.php?mode=ticket_info&ticket_id=100569 (28/1)
Wahid has got stuck trying to reinstall his perfsonar box, if I'm reading it right the reinstall from the netimage isn't "taking". Has anyone seen this before or have any tips? Waiting for reply (10/3)

GLASGOW
https://ggus.eu/index.php?mode=ticket_info&ticket_id=101565 (26/2)
LHCB wanting MaxCPUTime to be published. Sam has eloquently explained his point about why he doesn't want to set this, I fear that some kind of impasse has been reached, and I'm not sure where to go on this issue. In progress (4/3)

PERFSONAR
https://ggus.eu/index.php?mode=ticket_info&ticket_id=101136 (RALPP)
https://ggus.eu/index.php?mode=ticket_info&ticket_id=100037 (SHEFFIELD)
Any news on upgrading the perfsonar instances at RALPP or SHEFFIELD? Reminder dates on these tickets have passed by a week now.


That's all my addled brain can process, I'm afraid. Can sites please check the link below (oh, and yippie for GGUS search bringing back ordering by site again):
http://tinyurl.com/p37ey64

Monday 3rd March 2014, 14.30 GMT
44 Open UK NGI tickets this week.

NGI
https://ggus.eu/index.php?mode=ticket_info&ticket_id=101502 (24/2)
ILC moving to cvmfs, so those of us seeking to continue support will need to enable it. IC and Cambridge have already moved and been confirmed working. It might be easier if we collate any other sites who have moved into a single list to give to ILC. The working plan is to open tickets against sites who haven't moved after giving them a suitable grace period. In progress (26/2)

TIER 1
https://ggus.eu/index.php?mode=ticket_info&ticket_id=99556 (6/12/13)
The NGI Argus ticket. There's been great progress on this; can we reflect some of it in the ticket? Or perhaps close it if we're satisfied. In progress (13/2)

https://ggus.eu/index.php?mode=ticket_info&ticket_id=101491 (23/2)
The RAL perfsonar latency box is being troublesome. It crashed and was brought back up again, but has crashed again so Duncan has reopened the ticket. Reopened (3/3)

https://ggus.eu/index.php?mode=ticket_info&ticket_id=101716 (28/2)
This cms transfer ticket has INFN as the "notified site"; surely it should be RAL-LCG2 instead? I didn't change it myself in case I missed some nuance. Transfer problems appear to be linked to the virtualisation problems RAL have been experiencing affecting FTS3. In progress (3/3)

https://ggus.eu/index.php?mode=ticket_info&ticket_id=101729 (1/3)
LHCB pilots failing on a RAL CE. Being looked into. In Progress (3/3)

https://ggus.eu/index.php?mode=ticket_info&ticket_id=101701 (28/2)
ILC having troubles with the RAL ARC CEs. Looks to be a user group for ilc (production) missing. In progress (28/2)

https://ggus.eu/index.php?mode=ticket_info&ticket_id=101052 (6/2)
Biomed having trouble retrieving results from RAL cream CEs. Tracked down to the RAL EMI2 argus not handling RFC proxies. An update to EMI3 is hoped to fix this, although Dan reports that this isn't the case at QM (see 101639). In progress (27/2)

https://ggus.eu/index.php?mode=ticket_info&ticket_id=101532 (25/2)
LHCB noting that RAL is publishing the default MaxCPUtime. Fixed but Orlin notes some caching behaviour. Maria AP chimed in that you might have a buggy bdii version in the chain. In progress (26/2)

https://ggus.eu/index.php?mode=ticket_info&ticket_id=100114 (8/1)
Chris W's ticket concerning jobs failing to get from RAL to Imperial. Catalin asked for some testing, but Chris has been busy. The ticket hit its second reminder though. Waiting for reply (11/2)

https://ggus.eu/index.php?mode=ticket_info&ticket_id=97025 (3/9/13)
Longstanding myproxy issue. Andrew reports that the new myproxy service is up and running, so I assume this ticket can be closed soon? Or at least put back in progress. On hold (25/2)

https://ggus.eu/index.php?mode=ticket_info&ticket_id=101079 (9/2)
ARC CEs having a default SE of 0 and not being able to tune this per VO. Andrew is figuring out a fix to this. In progress (25/2)

https://ggus.eu/index.php?mode=ticket_info&ticket_id=98249 (21/10/13)
cvmfs for Sno+. Ticket on hold whilst tarballs are created. Been that way for a while. On hold (29/1).

EDINBURGH
https://ggus.eu/index.php?mode=ticket_info&ticket_id=100569 (28/1)
ECDF's perfsonar box refusing MA connections. Wahid has rebooted the box but no joy. Duncan linked some instructions as requested. In progress (3/3)

https://ggus.eu/index.php?mode=ticket_info&ticket_id=99794 (16/12/13)
Access to the ECDF perfsonar pages. There's a big ACL overhaul going on at the moment; Andy apologises and will chase the central IT chaps about it. On hold (28/2)

https://ggus.eu/index.php?mode=ticket_info&ticket_id=101659 (27/2)
44444 jobs publishing on some ECDF CEs (as part of information system cleanup campaign). These CEs are due for retirement (replicant style) today, so this and the related tickets will be done with soon. In progress (3/3)

https://ggus.eu/index.php?mode=ticket_info&ticket_id=100840 (29/1)
Apel-Pub nagios test failures at ECDF. The guys are working on it, but sadly the ticket is escalating. Daniela posted a note that if you have a support ticket open with APEL (which I think is advisable) you should link it into this ticket. In progress (3/3)

https://ggus.eu/index.php?mode=ticket_info&ticket_id=95303 (1/7/13)
glexec deployment ticket. The ECDF lads are waiting on the tarball (i.e. me). Still. On hold (27/1)

RALPP
https://ggus.eu/index.php?mode=ticket_info&ticket_id=101726 (1/3)
LHCB ticket about the default CPU time (999999) being published at RALPP. I thought that RALPP had solved something like this recently, but maybe I dreamt it? Assigned (1/3) Update - Solved, something was being published that shouldn't be any more.

https://ggus.eu/index.php?mode=ticket_info&ticket_id=101727 (1/3)
Info system cleanup campaign, 4444444 job at RALPP. Assigned (1/3)

https://ggus.eu/index.php?mode=ticket_info&ticket_id=101398 (19/2)
LHCB would like xrootd holes poked in the RALPP firewall. As mentioned last week I believe this requires holes poked in the RAL firewall, which is undergoing an overhaul. This ticket could do with some attention mentioning these problems, and possibly on holding. In progress (19/2)

https://ggus.eu/index.php?mode=ticket_info&ticket_id=101136 (11/2)
Request to upgrade the RALPP perfsonar to the latest version. Due to a lack of hands on deck Chris postponed this work, with a reminder date of today. On hold (21/2)

IMPERIAL
https://ggus.eu/index.php?mode=ticket_info&ticket_id=101367 (18/2)
A cms user having trouble srmcping in his jobs at IC. Looks to be a java 1.7 mismatch problem. Simon has asked some questions, no answer yet (user has set notify to "on solution" so might not have got the update). Waiting for reply (24/2)

DURHAM
https://ggus.eu/index.php?mode=ticket_info&ticket_id=101752 (3/3)
LHCB jobs having problems at Durham. Ewan S. has asked if the problems persist. Waiting for reply (3/3)

https://ggus.eu/index.php?mode=ticket_info&ticket_id=101763 (3/3)
Part of the campaign to clean up the information system, Durham have been asked to update their BDIIs (site and resource) to not-buggy versions. Assigned (3/3)

https://ggus.eu/index.php?mode=ticket_info&ticket_id=101177 (12/2)
Durham trying to wash the biomed out of their SE's information system. No joy yet. I advise asking at the storage meeting if stuck. In progress (26/2)

https://ggus.eu/index.php?mode=ticket_info&ticket_id=99621 (10/12/13)
enmr noticed a bad WN, which was promptly quarantined. It hasn't been fixed, but I maintain that the problem itself is contained and solved if you want to close the ticket... On hold (28/1)

GLASGOW
https://ggus.eu/index.php?mode=ticket_info&ticket_id=101710 (28/2)
Nagios SRM-Put test failures. The problem is known (it's DPM being odd with its space reporting whilst a pool is readonly - Sam describes it better). In progress (28/2)

https://ggus.eu/index.php?mode=ticket_info&ticket_id=101565 (26/2)
LHCB sees that Glasgow is also publishing default max CPU time for some (all? one?) of their queues. Sam points out that this is on purpose (due in part to multicore jobs, jobs are limited by Wall time only), and asks if LHCB can't make educated guesses. Stefen replies with a point about the difference in "MaxCPUTime" and "MaxTotalCPUTime", but I'm not sure that covers the Glasgow concerns. Worth discussing to get a UK stance on this. In progress (3/3)

BRUNEL
https://ggus.eu/index.php?mode=ticket_info&ticket_id=100568 (28/1)
Perfsonar MA problem. Raul has been working steadily at this and it looks to be progressing nicely. In progress (28/2)

QMUL
https://ggus.eu/index.php?mode=ticket_info&ticket_id=101676 (27/2)
One of QM's perfsonar boxes is having problems, missing services. Likely to be caused by running a bleeding edge version of perfsonar. In progress (27/2)

https://ggus.eu/index.php?mode=ticket_info&ticket_id=101682 (27/2)
Brian has asked for a SE dump of QM atlas files. Assigned (27/2)

https://ggus.eu/index.php?mode=ticket_info&ticket_id=101557 (25/2)
Matt from SNO+ having trouble on a QM UI, delegating proxies to the FTS. The same works on lxplus though. This ticket needs a home, but there's an argument that it isn't a site problem (as a UI isn't necessarily part of a site). Assigned (26/2)

https://ggus.eu/index.php?mode=ticket_info&ticket_id=94746 (10/6/13)
Biomed haunting the QM SE's info system. I believe Chris is waiting on his changes to seep into the Storm release (100290). On hold (14/1)

BRISTOL
https://ggus.eu/index.php?mode=ticket_info&ticket_id=101669 (27/2)
lhcb ticketed Bristol, but the CE in question is in scheduled downtime. Possibly worth keeping this open whilst downtime is on to avoid a duplicate. In progress (27/2)

https://ggus.eu/index.php?mode=ticket_info&ticket_id=101516 (24/2)
Bristol's perfsonar ticket. Bristol upgraded which seems to have solved some of their problems, but their other server is having trouble now. Maybe the same again will fix it? In progress (25/2)

UCL
https://ggus.eu/index.php?mode=ticket_info&ticket_id=95298 (1/7/13)
glexec at UCL. No news for a while from Ben. Daniela reminds him that the EMI3 upgrade is also imminent. On hold (26/2)

https://ggus.eu/index.php?mode=ticket_info&ticket_id=101285 (16/2)
A perfsonar ticket for UCL. A power outage looks to have brutalised their box. No word yet on if Ben has been able to save it. On hold (22/2)

SHEFFIELD
https://ggus.eu/index.php?mode=ticket_info&ticket_id=101374 (19/2)
Sheffield's LHCB maxcputime ticket. Elena has set in progress but no news. In progress (25/2)

https://ggus.eu/index.php?mode=ticket_info&ticket_id=100037 (3/1)
A perfsonar ticket for Sheffield, whose perfsonar needs updating. No news for a while. On hold (3/2)

LANCASTER
https://ggus.eu/index.php?mode=ticket_info&ticket_id=95299 (1/7/13)
Lancaster's glexec ticket. Whilst there's been some progress in the glexec tarball (not as much as there should be, as tarball time keeps being redirected, particularly with EMI3), no movement on the ticket. On hold (31/1)

https://ggus.eu/index.php?mode=ticket_info&ticket_id=100566 (27/1)
Lancaster suffering Poo Perfsonar Performance (I couldn't resist the childish alliteration). It doesn't seem to be an artificial cap (the rate has peeped over the 1Gb/s mark now and again). Looking for bottlenecks, but not had any time to investigate. On hold (17/2)

EFDA-JET
https://ggus.eu/index.php?mode=ticket_info&ticket_id=97485 (21/9/13)
LHCB jobs failing at JET due to openssl problems. No progress for a while, after the JET guys exhausted everything. On hold (11/2)

Monday 24th February 2014, 15.00 GMT

36 Open UK tickets this week, but the majority are progressing nicely (only a third of them haven't had an update in the last week, and of these all of them are "On Hold").

NGI
https://ggus.eu/ws/ticket_info.php?ticket=101502 (24/2)
ILC have ticketed the UK to inform us of their move to using cvmfs for their software area. They've included extensive instructions (and updated their VO card). The best forum to ask questions of the VO seems to be this ticket. In progress (24/2)

TIER 1
https://ggus.eu/ws/ticket_info.php?ticket=99556 (6/12/13)
NGI Argus ticket. As seen on TB-Support, good progress here but the ticket could do with some love. In progress (13/2)

https://ggus.eu/ws/ticket_info.php?ticket=101015 (5/2)
This CMS phedex problem looks like it can be bounced to Minnesota. I advise being proactive with the bouncing - either reassign it yourselves or solve it with a big "not a problem in our power to fix". In progress (24/2)

RALPP
https://ggus.eu/ws/ticket_info.php?ticket=101398 (19/2)
LHCB want holes poked in the RAL firewall to allow direct xrootd access to the RALPP SE - more a heads up for everyone than a ticket nag. In progress (19/2)

EDINBURGH
https://ggus.eu/ws/ticket_info.php?ticket=100840 (29/1)
Daniela has given some tips on how to tackle this APEL nagios ticket. In progress (20/2)

PERFSONAR TICKETS:
A quick round up of these as there are a lot of them.

Lancaster: https://ggus.eu/ws/ticket_info.php?ticket=100566
RHUL: https://ggus.eu/ws/ticket_info.php?ticket=101135
ECDF: https://ggus.eu/ws/ticket_info.php?ticket=100569
RALPP: https://ggus.eu/ws/ticket_info.php?ticket=101136
Brunel: https://ggus.eu/ws/ticket_info.php?ticket=100568
UCL: https://ggus.eu/ws/ticket_info.php?ticket=101285
Sussex: https://ggus.eu/ws/ticket_info.php?ticket=101517
Durham: https://ggus.eu/ws/ticket_info.php?ticket=100968
Bristol: https://ggus.eu/ws/ticket_info.php?ticket=101516

There's a lot of them, but none are looking very neglected (yet). The one with the biggest risk of neglect is actually the Lancaster ticket! Others are soldiering on or have firm reminder dates set for their upgrade.

Tickets from the UK:
I had my dreams of easily searching for tickets submitted by UKers smashed: https://ggus.eu/ws/ticket_info.php?ticket=101362 So it looks like it's back to my old method of searching for "Walker", "Bauer" or "Jones" :-D

Monday 17th February 14.30 GMT
35 Open UK tickets this week - the number is creeping up, I think largely due to the build up of perfsonar tickets. I plan to look at these in detail next week (or maybe bring them up in the Storage meeting if that's a more appropriate forum?).

TIER 1
https://ggus.eu/ws/ticket_info.php?ticket=99556 (6/12/2013)
The NGI Argus ticket. Ewan has helped out with some successful testing; there's a general call for others to get involved if they fancy it. In progress (13/2)

https://ggus.eu/ws/ticket_info.php?ticket=100114 (8/1)
Jobs failing on the RAL WMS, due to the gridsite/openssl/proxy size debacle. Chris successfully tested lcgwms06 after it was updated. Now lcgwms04 and 05 have been updated and Chris has once again been asked to work his testing magic (my apologies if this is already on your to do list Chris). Waiting for reply (11/2)

https://ggus.eu/ws/ticket_info.php?ticket=101052 (6/2)
Biomed having trouble with one of the RAL CEs. What really caught my eye here was that Biomed are using JSaga for their job submission - do we have any other user groups using this? (This also leads me to once again question what I find interesting!). No problems with the ticket itself. In Progress (14/2)

https://ggus.eu/ws/ticket_info.php?ticket=101015 (5/2)
This CMS transfer problem (between Minnesota and RAL) ticket is looking a bit ropey. Last word on Friday was that the transfers were still failing. Of course, there are two sides to every transfer failure. In progress (14/2)

https://ggus.eu/ws/ticket_info.php?ticket=101079 (9/2)
I don't mean to pick on the Tier 1, but you keep getting thrown the interesting problems. Another "Idiosyncrasies of the ARC CE" ticket, here we see its oddness with publishing different default SEs for different VOs. Again, naught actually wrong with the ticket. In progress (17/2)

RHUL
https://ggus.eu/ws/ticket_info.php?ticket=101135 (11/2)
I lied earlier, and I am bringing up one of the perfsonar tickets. Any luck with getting your perfsonar updated Govind? In progress (11/2)

GLASGOW
https://ggus.eu/ws/ticket_info.php?ticket=98253 (21/10/2013)
The getting CMS to work at Glasgow epic (or would you prefer saga?). CMS have pointed out that the original problem is solved, so from their point of view the ticket can be closed when the Glasgow guys feel satisfied. The ticket is in "waiting for reply", but I'm not sure that anyone who you'd like to have input from is paying attention (the second reminder went out today). Waiting for reply (17/2)

DURHAM
https://ggus.eu/ws/ticket_info.php?ticket=101177 (12/2)
Durham's SE is publishing biomed support when Durham no longer support them. Here's wishing you good luck with purging biomed from your system! In progress (17/2)

"Submitted from the UK"
I've been very lax about tracking tickets submitted by us NGI_UKers (partly as I never found a good way of doing it), but Steve's submission of the dteam voms server problem ticket (101177) whilst I was writing this up has prompted me to retackle that one. Watch this space!

Monday 10th February 2014, 15.00 GMT
32 tickets for the UK this week.

RALPP
https://ggus.eu/ws/ticket_info.php?ticket=100849 (29/1)
This perfsonar ticket is still just in the "assigned" state; don't make Duncan feel spurned, take a look at his ticket. Assigned (29/1)

TIER 1
https://ggus.eu/ws/ticket_info.php?ticket=99556 (6/12/13)
NGI argus setup. argusngi.gridpp.rl.ac.uk is set up and in the GOCDB, but what next with the ticket? In progress (30/1)

https://ggus.eu/ws/ticket_info.php?ticket=100114 (8/1)
A ticket from Chris W concerning job failures due to the 512-bit proxy problem. Catalin asked for the update to be tested, but is this testing covered in https://ggus.eu/ws/ticket_info.php?ticket=100343? Waiting for reply (6/2)

Talking of which, can:
https://ggus.eu/ws/ticket_info.php?ticket=100343
and
https://ggus.eu/ws/ticket_info.php?ticket=100887 (gridsite version on the webdav LFC)
be closed?

And that's it really. A scan through the solved ticket pile doesn't show anything exciting. But on the second Monday of a month I tend to overcompensate for going over all the tickets the week before, so let me know if I missed ought.

Monday 3rd February 2014, 14.30 GMT
Only 29 open tickets in the UK at the moment. To split it further, only 4 of these are "green", three are "yellow", the rest are "red". 7 are perfsonar related tickets, the only really big group of tickets we have.

RALPP
https://ggus.eu/ws/ticket_info.php?ticket=100480 (23/1)
Some obsolete entries were being published at RALPP, Chris thinks he has fixed it though (a problem on the cluster BDII), awaiting confirmation. Waiting for reply (31/1) Update - Solved

https://ggus.eu/ws/ticket_info.php?ticket=100849 (29/1)
Duncan has ticketed RALPP over their perfsonar latency box, he reckons a full log partition. Looks like this ticket hasn't been noticed yet though. Assigned (30/1)

OXFORD
https://ggus.eu/ws/ticket_info.php?ticket=99642 (10/12)
Backup Voms server testing for GridPP and Southgrid VOs at Oxford. On hold (30/1)

BRISTOL
https://ggus.eu/ws/ticket_info.php?ticket=99910 (20/12/2013)
LHCB having problems with the environment at Bristol, tracked to ARC being an odd duck. The problem has been forwarded to the ARC devs. On hold (21/1)

GLASGOW
https://ggus.eu/ws/ticket_info.php?ticket=98253 (21/10/2013)
Getting CMS working at Glasgow - the ticket. Gareth has updated a magic CMS xml file using one given to him by Daniela and notes that they're still failing CMS xrootd tests. Gareth asks if the tests are critical, and if they are he pleads for help. The lack of CMS credentials is really nobbling their efforts to get this sorted, or even dig up docs. Waiting for reply (3/2) Update - Daniela provided an update containing what I can only assume is an invocation of dark forces, Gareth has risked his immortal soul and applied it.

EDINBURGH
I'll probably be better off coming back to these in a few weeks' time!

https://ggus.eu/ws/ticket_info.php?ticket=100840 (29/1)
ECDF have an APEL-Pub nagios error going on. Looks like this has flown under the radar, probably due to both Andy and Wahid having more important things on their mind right now. Assigned (29/1)

https://ggus.eu/ws/ticket_info.php?ticket=99179 (25/11/2013)
Glue2 obsolete entries. Plans to retire the CEs have been slowed down due to waiting on networking changes. Andy reported that he'll fix the publishing if they're not in a position to decommission soon. On hold (24/1)

https://ggus.eu/ws/ticket_info.php?ticket=99180 (25/11/2013)
Similar to above, but publishing default values. It's the same CEs at fault, so this ticket is in the same boat. On hold (4/12/2013)

https://ggus.eu/ws/ticket_info.php?ticket=99794 (16/12/2013)
ECDF's perfsonar boxen blocking access to their webpages. Was held up by Christmas, but no news since - probably won't be for a few weeks. On hold (16/12/2013)

https://ggus.eu/ws/ticket_info.php?ticket=100569 (28/1)
The perfsonar latency box has started refusing connections. On hold whilst Andy's off. On hold (28/1)

https://ggus.eu/ws/ticket_info.php?ticket=95303 (1/7/2013)
glexec ticket. Sadly the same story as last time (or the last times).

DURHAM
https://ggus.eu/ws/ticket_info.php?ticket=99621 (10/12/2013)
Durham have a bad worker node, spotted by enmr.eu. Whilst the guys haven't had a chance to fix it, one could argue that an offlined problem is a solved problem, as it can't hurt the jobs anymore. On hold (28/1)

SHEFFIELD
https://ggus.eu/ws/ticket_info.php?ticket=100037 (3/1)
Sheffield's perfsonar box needed some site firewall holes poking for it. On the to do list is an upgrade and assimilation into the mesh due to only testing against 6 sites currently. On hold (27/1)

MANCHESTER
https://ggus.eu/ws/ticket_info.php?ticket=100867 (30/1)
Teething problems for Manchester's new perfsonar boxes. Alessandra asks Duncan if it can be closed. In progress (3/2) Update - Solved, and wasn't a site problem to begin with.

LANCASTER
https://ggus.eu/ws/ticket_info.php?ticket=100566 (27/1)
Lancaster isn't getting 10G performance out of its perfsonar boxen. My suspicion is that the NICs themselves are running slow, not the switches. Maybe I'm using the wrong drivers? In progress (3/2)

https://ggus.eu/ws/ticket_info.php?ticket=95299 (1/7/2013)
Lancaster's GLEXEC ticket, waiting on me getting a tarball one working. I'm currently trying out another tarball one on my test bed, but it's early days yet (it's more an exercise in documenting the errors at the mo). On hold (31/1)

https://ggus.eu/ws/ticket_info.php?ticket=100011 (31/12/2013)
Biomed stopped working for one of the Lancaster CEs. The ticket suffered from lack of priority (sorry biomed!). On hold (24/1)

UCL
https://ggus.eu/ws/ticket_info.php?ticket=95298 (1/7/2013)
The UCL glexec ticket. SL6 and DPM upgrades are done, Ben is just getting things settled before he starts tackling this. On hold (27/1)

QMUL
https://ggus.eu/ws/ticket_info.php?ticket=94746 (10/6/2013)
QM having trouble scrubbing the biomed out of their SE's information system. Chris submitted https://ggus.eu/ws/ticket_info.php?ticket=100290 and has put a lot of hours into this. On hold (14/1)

BRUNEL
https://ggus.eu/ws/ticket_info.php?ticket=100568 (28/1)
Brunel's perfsonar has problems. Raul plans to upgrade, and has made known his distaste that an upgrade requires a reinstall. In progress (29/1)

EFDA-JET
https://ggus.eu/ws/ticket_info.php?ticket=97485 (21/9/2013)
LHCB job problems still haunting JET. I think this ticket should be in "Waiting for reply", but I also think that I know the answer to the question (that the error message they're seeing is a red herring). In progress, should be in some other status (29/1)

TIER 1
https://ggus.eu/ws/ticket_info.php?ticket=100114 (8/1)
Chris has spotted jobs failing to get from RAL WMS to Imperial. Looked to be SSL problems. On hold awaiting RAL upgrade to the next WMS release. On hold (30/1)

https://ggus.eu/ws/ticket_info.php?ticket=100343 (16/1)
RAL WMS producing 512-bit proxies (occasionally). Waiting on the same release. Waiting for reply (?) (27/1)

https://ggus.eu/ws/ticket_info.php?ticket=100887 (31/1/2014)
Due to the same underlying issue as the above tickets, Chris asks for the gridsite package on the webdav LFC to be updated. In progress (31/1)

https://ggus.eu/ws/ticket_info.php?ticket=100507 (23/1)
CMS transfers failed between Caltech and RAL. The problem has eased itself, so the ticket only needs to be kept open if further investigation is warranted (as Brian pointed out). In progress (3/2)

https://ggus.eu/ws/ticket_info.php?ticket=98249 (21/10/2013)
CVMFS for SNO+. Almost there, creating the Sno+ tarballs to test with is taking longer than expected. On hold (29/1)

https://ggus.eu/ws/ticket_info.php?ticket=99556 (6/12/2013)
The new NGI Argus server (argusngi.gridpp.rl.ac.uk) has been set up in the gocdb and is online. In progress (30/1)

https://ggus.eu/ws/ticket_info.php?ticket=97025 (3/9/2013)
Ye olde RAL myproxy server name confusion issue. No news on this for a while, the hope is having this dealt with soon. But then the last update was nearly a month ago, so soon isn't as soon as we'd like it to be! On hold (6/1)

That's all folks. I noticed a few longstanding tickets have been solved over the course of January, so thanks for that!

Monday 27th January 2014, 15.00 GMT
33 Open UK Tickets this week.

Courtesy of John Kewley's Posse of Ticket Wranglers we have:

OXFORD
https://ggus.eu/ws/ticket_info.php?ticket=99642 (10/12/2013)
Southgrid Backup Voms server testing. I suspect other, squeakier wheels have been getting the Oxford grease (where the heck am I going with this analogy?). Unless you're going to get stuck into it right now, probably best to On Hold until you're actually sat down actively poking it. In progress (8/1)

SHEFFIELD
https://ggus.eu/ws/ticket_info.php?ticket=100037 (3/1)
Problems with the Sheffield Perfsonar host. Looks like the Sheffield host might need an upgrade (or at least implementation of the mesh). Again, if it doesn't look like you'll get to this soon can you On Hold. In progress (13/1)

Spotted with my own eyes:

RHUL
https://ggus.eu/ws/ticket_info.php?ticket=100527 (24/1)
An atlas ticket concerning the RHUL storage. Looks like it might have snuck in amongst the Monday morning e-mail pile. Assigned (24/1)

That's all really. We're down to 33 tickets (from 42 last week). As usual I'll be going over all of them next week, but feel free to bring any up that are particularly close to your heart in the meeting or online.

Please check your site tickets here:
http://tinyurl.com/cblj3ab

Monday 20th January 2014, 14.30 GMT
There are 42 Open UK tickets this week. Where did they all come from? Let's take a look.

EFDA-JET
https://ggus.eu/ws/ticket_info.php?ticket=97485 (21/9/2013)
LHCB jobs failing at Jet. The Jet chaps have just fixed an SSL problem at their site, so would like to see if this has fixed the LHCB problems. Waiting for reply (20/1) Update - things are still failing, reading the error perhaps JET have picked up some weird rpms somewhere?

(This also possibly solves the Jet gLeXeC ticket https://ggus.eu/ws/ticket_info.php?ticket=95295 UPDATE - SOLVED, the Jet guys put in a fix to JAVA to solve the keysize problem and things work now)

UCL
https://ggus.eu/ws/ticket_info.php?ticket=100342 (16/1)
Atlas are seeing transfer failures to/from UCL's dpm. Looks like an authentication problem, Ben might need a hand. In progress (20/1)

TIER 1
https://ggus.eu/ws/ticket_info.php?ticket=100333 (16/1)
Looks like this problem Tom and Chris spotted with one of the RAL WMSii has been solved; the case can be closed. In progress (17/1) SOLVED

https://ggus.eu/ws/ticket_info.php?ticket=100343 (16/1)
But the WMSses still bring us pain, here Chris documents that the RAL ones are still producing 512-bit proxies. Chris also helpfully links two other WMS tickets. In progress (17/1)

https://ggus.eu/ws/ticket_info.php?ticket=98122 (17/10/2013)
But Tom provides another win, this time with the cern@school cvmfs repo. He's managed to get it working, able to put data into it, so this ticket can probably be closed too. In progress (17/1) SOLVED

https://ggus.eu/ws/ticket_info.php?ticket=100114 (8/1)
But then the WMS try to spoil our buzz again with another ticket. Although I believe this is the forerunner to 100343 above. In progress (16/1)

BRUNEL
https://ggus.eu/ws/ticket_info.php?ticket=100188 (10/1)
Raul has provided Brian with the database dump from his SE (it should have landed in Brian's inbox), I think this ticket can be closed if the dump looks alright. In progress (16/1)

BRISTOL
https://ggus.eu/ws/ticket_info.php?ticket=99910 (20/12/2013)
LHCB problems at Bristol, due to ARC doing strange things to the environment. A few brave fixes have been attempted, but no joy. Waiting on feedback from the ARC developers - if that takes a while this ticket will need to be On Holded. In progress (14/1)

ECDF
https://ggus.eu/ws/ticket_info.php?ticket=99794 (16/12/2013)
Poking holes in the Edinburgh firewall for the perfsonar box. Any news from the IT overlords? I understand that there's a pending Edinburgh baby boom, so I'm not sure if anyone's still about? On hold (13/1)

GLASGOW
https://ggus.eu/ws/ticket_info.php?ticket=98253 (21/10/2013)
The "getting CMS working at Glasgow" ticket. It's looking almost as neglected as my gym membership. On hold (16/12/2013)

MANCHESTER
https://ggus.eu/ws/ticket_info.php?ticket=97066 (5/9/13)
Getting the Manchester perfsonar boxes back up and running. How goes it? On hold (7/1)

SHEFFIELD
https://ggus.eu/ws/ticket_info.php?ticket=98594 (4/11/2013)
The LHCB job uploading problem at Sheffield. It seems all parties have gotten stuck, so we need to decide where to go with this. On hold (8/1)

DURHAM
https://ggus.eu/ws/ticket_info.php?ticket=99621 (10/12/13)
Just making sure this ticket, with a bad node needing offlining, isn't forgotten about. On hold (19/12)

Similar with the Durham GLEXEC ticket https://ggus.eu/ws/ticket_info.php?ticket=95302 - it was On Holded over Christmas, but Christmas was a while ago now. In fact, with Creme eggs out, it must be nearly Easter already... right?

EXTRA EXTRA
RALPP
https://ggus.eu/ws/ticket_info.php?ticket=100401 (20/1)
This nagios glexec alarm ticket, which Chris quickly jumped on, has been reopened on you guys. Just bringing it up as reopened tickets have a habit of sneaking under the radar. Reopened (21/1)

OXFORD
https://ggus.eu/ws/ticket_info.php?ticket=100348 (17/1)
Atlas are getting a little antsy for some news on this ticket. And also don't seem to understand what the waiting for reply state is for... Waiting for reply (21/1)


Monday 6th January, 14.30 GMT
Happy New Year Everybody!

38 Open UK tickets this year.

NGI
https://ggus.eu/ws/ticket_info.php?ticket=99854 (18/12/13)
The NGI ROD has a ticket open against it, Jeremy has asked for clarification but no word back yet. Waiting for reply (26/12/13)

SUSSEX
https://ggus.eu/ws/ticket_info.php?ticket=95165 (28/6/13)
Sussex's Perfsonar ticket. There's been a lot of progress thanks to new Sussex admin Matt (Hi Matt!). Duncan suggests leaving it a few days to collect data so we can see where we are with this. In progress (3/1)

https://ggus.eu/ws/ticket_info.php?ticket=99198 (26/11/13)
glexec ops nagios test failures at Sussex. The new Matt has gone great guns over other tickets at the site, although this problem still haunts them. If you can't see the solution maybe a mail to TB-SUPPORT is in order? In progress (31/12)

OXFORD
https://ggus.eu/ws/ticket_info.php?ticket=99642 (10/12/13)
Backup VOMS server testing ticket for Oxford. Testing was going well but I think something else came along! Needs some love. In progress (10/12/13)

BRISTOL
https://ggus.eu/ws/ticket_info.php?ticket=99796 (16/12/13)
A ticket about Bristol's perfsonar. Winnie is having the relevant holes poked into their firewalls, things are looking good (from the ticket) - actually not sure if it should be in "Waiting for Reply". In Progress (3/1)

https://ggus.eu/ws/ticket_info.php?ticket=99910 (20/12/13)</br> LHCB have spotted a CVMFS problem at Bristol. After a surprise power outage it looks like LHCB jobs aren't getting their SW_DIR set right, even though it looks like the infrastructure to set it up is in place. In progress (6/1)

GLASGOW</br> https://ggus.eu/ws/ticket_info.php?ticket=99639 (10/12/13)</br> The Glasgow VOMS Backup Server testing ticket. Some progress was made but Dave mentions that it would have to wait to the New Year before it can be finished off. On Hold (19/12/13)

https://ggus.eu/ws/ticket_info.php?ticket=100012 (31/12/13)</br> Biomed test jobs were failing at Glasgow - Dave thinks he snuffed out the problem and it looks like tests are being passed again. You might want to solve this one yourselves or at least Waiting for Reply it. In progress (6/1)

(As you can see over the holiday period GGUS tickets broke the 6-figure mark).

https://ggus.eu/ws/ticket_info.php?ticket=98253 (21/10/13)</br> A CMS ticket that evolved to "getting CMS working at Glasgow". Not much news for a while, last word was that Sam was looking at the CMS DPM redirector. On hold (3/12/13)

EDINBURGH</br> https://ggus.eu/ws/ticket_info.php?ticket=99794 (16/12/13)</br> ECDF's ticket regarding access to their Perfsonar Webpages. Andy submitted a request for the ports to be opened, but no progress was expected until nowish. On hold (16/12)

https://ggus.eu/ws/ticket_info.php?ticket=99180 (25/11/13)</br> Some of Edinburgh's CEs are publishing default values. This seems to only affect older CEs pointing at SL5 resources; as these will be decommissioned soon, the strategy is to not bother fixing this issue. On hold (4/12)

https://ggus.eu/ws/ticket_info.php?ticket=99179 (25/11/13)</br> In a similar vein, some of the ECDF services are publishing obsolete GLUE2 entries. This appears to be the same problem as above, with the same solution. On hold (10/12)

https://ggus.eu/ws/ticket_info.php?ticket=95303 (1/7/13)</br> GleXEC ticket. No news as ECDF are a tarball site, although I see that Wahid assigned the ticket to Mark Mitchell. What did Mark do to deserve that? On hold (23/12)

DURHAM</br> https://ggus.eu/ws/ticket_info.php?ticket=99621 (10/12/13)</br> Durham had a bad WN eating enmr.eu jobs (as with Bristol, the problem seemed to be a bad environment). Ewan has flagged it to be fixed after Christmas; the bad node is offline though, so it shouldn't be a bother. On hold (19/12/13)

https://ggus.eu/ws/ticket_info.php?ticket=95302 (1/7/13)</br> Durham's GlexEC ticket. Work paused for Chrimbo, but Ewan mentioned the lack of documentation on how to test this yourself. On hold (19/12)

SHEFFIELD</br> https://ggus.eu/ws/ticket_info.php?ticket=99955 (26/12/13)</br> Atlas jobs were failing with stage-in problems. Elena switched back to using rfio from xroot and suddenly the error rate dropped right off. Something for us to discuss in the storage/atlas meetings? In progress (6/1)

https://ggus.eu/ws/ticket_info.php?ticket=98594 (4/11/13)</br> LHCB file uploading problems. Despite a lot of effort and retuning the NAT the problem persists. Any suggestions? In progress (16/12/13)

https://ggus.eu/ws/ticket_info.php?ticket=95301 (1/7/13)</br> glexec ticket. There was a request for an estimated deployment date from the GGUS ticket guys. On hold (29/10/13)

https://ggus.eu/ws/ticket_info.php?ticket=99793 (16/12/13)</br> Access to the Sheffield perfsonar web servers. At last word Elena was checking the iptables on her nodes. No news since. In progress (17/12)

https://ggus.eu/ws/ticket_info.php?ticket=100037 (3/1)</br> Perfsonar problem at Sheffield. In progress (5/1)

MANCHESTER</br> https://ggus.eu/ws/ticket_info.php?ticket=100038 (3/1)</br> Manchester's perfsonar hosts have hit a spot of bother. In progress (6/1)

https://ggus.eu/ws/ticket_info.php?ticket=97066 (5/9/13)</br> A ticket about Manchester's perfsonar hosts, where at last word their nodes were to be reinstalled. Not sure how this relates to 100038. On hold (5/12/13)

LANCASTER</br> https://ggus.eu/ws/ticket_info.php?ticket=95299 (1/7/13)</br> Lancaster's GlexeC ticket. Ahem. On hold (16/12/13)

https://ggus.eu/ws/ticket_info.php?ticket=100011 (31/12/13)</br> Biomed tests aren't working on one of Lancaster's CEs. Being poked. In progress (6/1)

UCL</br> https://ggus.eu/ws/ticket_info.php?ticket=95298 (1/7/13)</br> Glexec ticket. On the to do list, after the DPM upgrade is done with. On hold (18/12)

https://ggus.eu/ws/ticket_info.php?ticket=98125 (17/10/13)</br> Atlas transfer failures. The DPM is upgraded, but there may be some space issues. Paused for the holidays. On hold (20/12/13)

QMUL</br> https://ggus.eu/ws/ticket_info.php?ticket=94746 (10/6/13)</br> The Ghost of publishing past is haunting QM's SE, where biomed support is published where it shouldn't be. Chris will still get to it when he has the time. On hold (19/12/13)

BRUNEL</br> https://ggus.eu/ws/ticket_info.php?ticket=99996 (30/12/13)</br> Nagios APEL-Pub failures. Raul has run the publisher, but it didn't seem to work. EMI3 Apel woes? In progress (6/1)

EFDA-JET</br> https://ggus.eu/ws/ticket_info.php?ticket=95295 (1/7/13)</br> glexeC ticket. Jet are nearly there, just needing to iron out some problems. On hold (11/12/13)

https://ggus.eu/ws/ticket_info.php?ticket=100045 (3/1)</br> Nagios glexec-ops test failures. One of those bugs that need ironing out. In progress (6/1)

https://ggus.eu/ws/ticket_info.php?ticket=97485 (21/9/13)</br> LHCB job failures at EFDA-JET, with an odd authentication-like error. At last word the problem persisted. On hold (9/12)

TIER 1</br> https://ggus.eu/ws/ticket_info.php?ticket=98249 (21/10/13)</br> CVMFS for Sno+. Waiting on SW tarballs from the VO. Waiting for reply (6/1)

(In other news T2K and HyperK have had their CVMFS tickets successfully closed).

https://ggus.eu/ws/ticket_info.php?ticket=99647 (10/12/13)</br> Sno+ lcg-cp timeouts at the Tier 1. There was a request for more information from the VO, which just had its second reminder last week. Waiting for reply (17/12/13)

https://ggus.eu/ws/ticket_info.php?ticket=99556 (6/12)</br> NGI Argus ticket. A server has been deployed for testing, work was paused for the holidays. In progress (30/1)

https://ggus.eu/ws/ticket_info.php?ticket=97025 (3/9)</br> The RAL Myproxy server's certificate problem; this ticket is serving as an open reminder of the issue. No recent progress, but hopefully it'll be solved this month. On hold (6/1)

https://ggus.eu/ws/ticket_info.php?ticket=86152 (17/9/12)</br> "correlated packet-loss on perfsonar host". The last 2012 ticket. There was a plan to reinstall this on new hardware, but that was in October. On hold (18/10/13)

https://ggus.eu/ws/ticket_info.php?ticket=99768 (13/12/13)</br> Atlas source file errors. Thought to be a renaming problem, but they have reoccurred. The ticket is in "waiting for reply" and I'm not sure it should be any more. Waiting for reply (29/12/13)

https://ggus.eu/ws/ticket_info.php?ticket=98122 (17/10/13)</br> cern@school's cvmfs-of-their-own ticket. Good progress on testing, Tom reports successfully uploading a tarball. Waiting for reply (6/1)