Monday 8th February 2016, 13.30 GMT
43 Open UK Tickets this month. Going over all of them, in kinda-alphabetical order.
NGI
118930 (18/1)
That NGI information ticket, linked to the "wrong" (according to some) information being published by the UK arc CEs. This has haunted us for a while; the consensus was that the ticket is a load of B-word and not really worth worrying over, but it does warrant a response (from someone other than Steve J). Assigned (19/1)
SUSSEX
With Matt RB off to pastures green Sussex is in limbo - I'll contact Jeremy M concerning this last week's fresh tickets.
117894 (23/11)
Atlas Consistency Checking. On hold (25/1)
118289 (10/12)
Gridpp Pilots. On hold (25/1)
118337 (14/12)
The Sussex SE was not working for Sno+ - the most serious of these older issues. On hold (25/1)
119383 (5/2)
ROD Availability ticket. Assigned (5/2)
119384 (5/2)
ROD CA distribution ticket. Maybe the two ROD tickets are correlated (i.e. if we fix this one the previous one will soothe itself?) Assigned (5/2)
RALPP
118945 (19/1)
Poor CMS SAM results for RALPP due to digi-reco work pummeling the RALPP storage - Chris has asked for the digi-reco workload to stop at RALPP, then asked for clarification as to why the site was still in unknown state. Waiting for reply (25/1) Solved - it was them, not RALPP - a restart of the SAM services looks to have cleared the issue.
118628 (5/1)
LZ Pilot deployment at RALPP. Chris has submitted a bug report to nordugrid to fix the issue (http://bugzilla.nordugrid.org/show_bug.cgi?id=3529), which was fixed and should be available in the next release. On Hold (26/1) Update - Chris is trying to get hold of a pre-release to test things.
OXFORD
119197 (29/1)
CMS has asked to change some CRAB site configs at T3s - Daniela has asked Chris B if he's the one looking after this for Oxford. Assigned (3/2)
117892 (23/11)
Atlas consistency checks. Ewan has firmly and clearly put this on the backburner. On hold (12/1)
BIRMINGHAM
118155 (4/12)
Biomed having a clear up of their stuff on the Brummie SE. Franck has given the nod for deleting the dark data left in the DPM after their cleanup efforts. It's on their heads now! In progress (2/2)
117890 (23/11)
Another Atlas Storage Consistency Checking ticket. Any chance to have a look at this again? On hold (15/12)
GLASGOW
117706 (19/11)
Another pilot ticket, this time for pheno. Glasgow were going to roll this into their overhaul of their identity management gubbins, but the Universe messed with their plans. How goes things? On hold (15/1)
118052 (30/11)
HTTP support on the Glasgow SE. I suspect progress here took a similar shoeing to the identity management plan - but the ticket could do with an update (and maybe on holding). In Progress (4/1)
ECDF
118787 (12/1)
Another HTTP ticket. Let us know if you need a hand Marcus and Andy. Or if you're too busy to make this a priority consider on-holding it. In progress (12/1)
95303 (1/7)
Tarball glexec ticket. On hold for a very long time.
An update on this - I managed to put in some good hours on trying to build a relocatable glexec last week, successfully building from source glexec and the lcas/lcmaps stack. *But* I still have rpath problems - short of attacking every lib file with patchelf I'm not sure how to proceed, and the process is such a mess that I'm not sure if I'll ever manage to make it into a proper recipe (much like my cocoa-butter shortbread).
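For anyone curious what "attacking every lib file with patchelf" might look like, here is a rough sketch only, not the recipe actually used at Lancaster. It assumes a tarball layout with the shared libraries under a single lib directory inside the install prefix, and the function names (is_elf, relative_rpath, patch_tree) are made up for illustration. The only external tool it relies on is patchelf's --set-rpath option. It defaults to a dry run that just lists the commands it would issue.

```python
import os
import subprocess

ELF_MAGIC = b"\x7fELF"


def is_elf(path):
    """Return True if the file starts with the ELF magic bytes."""
    try:
        with open(path, "rb") as fh:
            return fh.read(4) == ELF_MAGIC
    except OSError:
        return False


def relative_rpath(elf_path, lib_dir):
    """Build an $ORIGIN-relative rpath from the ELF file's directory to lib_dir."""
    rel = os.path.relpath(lib_dir, os.path.dirname(elf_path))
    return "$ORIGIN" if rel == "." else "$ORIGIN/" + rel


def patch_tree(prefix, lib_dir, dry_run=True):
    """Walk prefix and collect a patchelf command for every ELF file found.

    With dry_run=False the commands are also executed (patchelf must be
    on PATH). lib_dir is an assumed location for the bundled libraries.
    """
    commands = []
    for root, _dirs, files in os.walk(prefix):
        for name in files:
            path = os.path.join(root, name)
            # Skip symlinks so each real file is only patched once.
            if os.path.islink(path) or not is_elf(path):
                continue
            cmd = ["patchelf", "--set-rpath", relative_rpath(path, lib_dir), path]
            commands.append(cmd)
            if not dry_run:
                subprocess.run(cmd, check=True)
    return commands
```

Using $ORIGIN keeps the rpath relative to wherever the tarball is unpacked, which is the point of a relocatable build. Whether that is enough for glexec (setuid binaries typically ignore $ORIGIN for security reasons) is a separate question this sketch does not answer.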
SHEFFIELD
119374 (5/2)
A fresh ticket from Biomed, about incorrect/no dynamic information being published at Sheffield. In progress (5/2) Update - see Steve B's post to TB-SUPPORT for clues, Elena is retackling these problems today.
118789 (12/1)
ROD Information system ticket, almost certainly caused by the same underlying issue. Is the bdii service on your CEs silently dying or failing to update?
114460 (18/6)
Gridpp Pilots. Changes were implemented but at last check things weren't working right. How goes it now? In progress (20/1)
117886 (23/11)
Atlas Storage Consistency Check ticket - any luck with this? On hold (29/1)
118764 (12/1)
HTTP support ticket for the Sheffield SE. Have you had a chance to have a look at this? In progress (25/1)
The Storage list can lend a hand fixing either of these issues (which goes for everyone of course).
MANCHESTER
118679 (7/1)
HTTP support (atlas edition). Hit a problem due to there being no outside-a-space-token space at Manchester. On Hold (12/1)
118674 (7/1)
HTTP Support (lhcb edition). As above. On Hold (12/1)
117885 (23/11)
Atlas Storage Consistency Checks - hit the same problem as the previous 2 tickets. On hold (10/1)
118603 (4/1)
A VOMS ticket rather than a site ticket, removal of the nsccs.ac.uk VO. The VO has been removed from the other UK voms servers. In progress (5/2) Update - solved
LANCASTER
95299 (1/7)
Lancaster's glexec tarball ticket. See the entry above - although I really need to update the ticket properly! Practice what you preach, Matt! On hold.
RHUL
119380 (5/2)
ROD Low availability ticket - the site is in the green now, so it's the usual 30-day wait. On hold (8/2)
117881 (23/11)
Atlas SCC ticket. On hold until March. On hold (1/2)
QMUL
117723 (19/11)
Pilots at QM. Dan's been working on this, and asked Daniela for a picture of what should be enabled[1] - Any joy? In progress (27/1)
[1] http://www.hep.ph.ic.ac.uk/~dbauer/dirac/site_pilot_status.html
117880 (23/11)
Atlas SCC ticket (wish I had started using that acronym sooner). Just waiting for the nod from atlas that all is well. Dan included the script he uses that may be useful for other STORM sites. Waiting for reply (4/2)
118985 (21/1)
QM has banished biomed from their queues until QM have a cgroupy solution to the ill-behaved biomed user jobs. Biomed have asked that the ban be reconsidered and problem users be dealt with by the VO. QM are perfectly right to say no to this, but it would be nice to not leave them hanging. On hold (1/2)
119348 (4/2)
LHCB have noticed cvmfs issues on some nodes, which Dan couldn't replicate. Dan ponders that perhaps this is caused by ephemeral memory issues on the nodes, noting more swap being used recently. Waiting for reply (4/2)
119409 (8/2)
Fresh ROD emi glexec ticket - things exploded at the weekend but the QM admins are fighting the good fight. In progress (8/2)
IMPERIAL
119294 - but this got solved by the time I got to it (it concerned a java update breaking md5).
BRUNEL
117878 (23/11)
Atlas SCC - Raul provided an example and is waiting on atlas to give a yay or nay before deploying. Waiting for reply (18/1)
118740 (10/1)
Atlas MCORE problems at Brunel, which look to be caused by some extreme Condor oddness. Raul reconfigured Condor to give a better view. Any joy? In progress (25/1)
100IT
119002 (Reopened)
116358 (In Progress)
Not going into detail with these as I'm not sure what the crack is with 100IT.
AND FINALLY...
THE TIER 1
118809 (12/1)
The Tier 1 provided feedback on configuring memory limits for batch jobs; the ticket was left open for follow up. On hold (13/1)
116864 (12/10)
CMS AAA tests failing. Andrew L reports that the CASTOR headnode has received what sounds like a big fix which will hopefully improve things. In progress (29/1)
119389 (5/2)
LHCB data transfer problem to RAL. Being looked at. In progress (5/2)
117683 (18/11)
Another publishing ticket. How we love those! This one about CASTOR not publishing GLUE 2. Code was written by Jens and Rob but not integrated, something that works might be a long way off. That was a month ago, any news since? In progress (5/2)
109358 (15/10) or (5/2)
This ticket is weird - it started in a "waiting for reply" state and was apparently issued in 2014! I can't find a ticket with this number in my records though.
Sno+ are unable to use the RAL WMS - it's being looked at. In progress (5/2)