Monday 5th October 2015, 14.15 BST
22 Open UK Tickets this month, all of them, Site by Site:
SUSSEX
116136 (9/9)
Sussex got a snoplus ticket for a high number of job failures, although simple test jobs ran okay. Matt asked if the problem persists; the reply was a resounding "not sure". In progress (think about closing) (21/9)
RALPP
116652 (1/10)
A ticket from CMS, about some important Phedex ritual that must occur on the 3rd of November, when the stars are right. The ticket needs some confirmation and feedback, plus the nomination of one site acolyte to receive the DBParam secrets from CMS - but the ticket only got assigned to sites this morning. Assigned (5/10)
BRISTOL
116651 (1/10)
Same as the RALPP ticket; Winnie has volunteered Dr Kreczko for the task. In progress (5/10)
ECDF
95303 (Long, long ago)
glexec ticket. On hold (18/5)
DURHAM
116576 (1/10)
Atlas ticket asking Durham to delete all files outside of the datadisk path. Oliver asks what this means for the other tokens (I think they can be sacrificed to feed datadisk, but Brian et al can confirm that). Waiting for reply (5/10)
SHEFFIELD
116560 (30/9)
Sno+ jobs having trouble at Sheffield. Looks like a stale proxy problem, as only 10 Sno+ jobs can run at Sheffield at a time. Matt M asks if/how the WMS can be notified to stop sending jobs in such a case. In progress (30/9)
114460 (18/6)
GridPP Pilot roles. No news on this for a while, after the last attempt seemed to not quite work. In progress (30/7)
MANCHESTER
116585 (1/10)
Biomed ticketed Manchester with problems from their VO nagios box, which Alessandra points out are due to there being no spare cycles for biomed to run on. Assigned (can be put on hold or closed?) (1/10)
LIVERPOOL
116082 (7/9)
A classic ROD availability ticket. On hold (7/9)
LANCASTER (a little embarrassing that my own site has the most tickets)
116478 (28/9)
Another availability ticket, this time for Lancaster (which has been through the wars in September). Still trying to dig our way out, but even the Admin's broke. On hold (5/10)
116676 (5/10)
Another ROD ticket, Lancaster's not quite out of the woods. We think WMS access is somewhat broken. We have no idea about the sha2 error. In progress (5/10)
116366 (22/9)
Sno+ spotted malloc errors at Lancaster. The problems seemed to survive one batch of fixes, but I asked again if they still see problems after running a good number of jobs over the weekend. Waiting for reply (5/10)
95299 (In a galaxy far, far away)
glexec ticket. This was supposed to be done last week, after I had figured out "the formula" - but then last week happened. On hold (5/10)
QMUL
115959 (31/8)
LHCb job errors at QM, with a 70% pilot failure rate on ce05. Dan couldn't see where things are breaking (only that the CE wasn't publishing to APEL) and asks if this is the cause of the problem. Waiting for reply (5/10)
116662 (5/10)
LHCb job failures on ce05, almost certainly a duplicate of 115959, but it might have some useful information in it. Assigned (probably can be closed as a duplicate) (5/10)
IMPERIAL
116650 (1/10)
Imperial's invitation to the CMS Phedex DBParam ritual. Daniela's on it, as well as the other CMS sites. On hold (5/10)
BRUNEL
116649 (1/10)
Brunel's ticket for the great DBParam alignment of 2015. On hold (5/10)
116455 (28/9)
A CMS request to change the xrootd monitoring configs. Did you get round to doing this last week, Raul? In progress (29/9)
EFDA-JET
115448 (3/8)
Biomed having trouble tagging the jet CE. The Jet admins think this is the same underlying issue as their other ticket, 115496. In progress (25/9)
115496 (5/8)
Biomed unable to remove files from the jet SE. There are clues that suggest that some DNS oddness is the cause, but it's not clear. In progress (18/9)
100IT
116358 (22/9)
Ticket complaining about a missing image at the site. Some to and fro, the ball is back in the site's court. In progress (2/10)
TIER 1
116618 (1/10)
The Tier 1's CMS DBParam ritual ticket. In progress (5/10)
Let me know if I missed ought.
T'OTHER VO NAGIOS
At time of writing things look a bit rough at QM, Liverpool (just getting over their downtime) and for Sno+ at Sheffield (likely related to their ticket).