|
|
Line 463: |
Line 463: |
| ===== ===== | | ===== ===== |
| <!-- ******************Edit start********************* -----> | | <!-- ******************Edit start********************* -----> |
− | '''Monday 2nd November 2015, 13.30 GMT'''<br /> | + | '''Tuesday 10th November 2015, 10.20 GMT'''<br /> |
− | 22 Open UK Tickets this week. First Monday of the Month, so all the tickets get looked at, however run of the mill they are.
| + | Down to 13 Open tickets this week - a busy week for everyone but me it seems! |
| | | |
− | First, the link to all the UK [http://tinyurl.com/nwgrnys '''tickets'''].
| + | All the UK [http://tinyurl.com/nwgrnys '''tickets''']. |
| | | |
− | '''SUSSEX'''<br />
| |
− | [https://ggus.eu/index.php?mode=ticket_info&ticket_id=116915 116915] (14/10)<br />
| |
− | Low availability Ops ticket. On hold whilst the numbers soothe themselves. On Hold (23/10)
| |
| | | |
− | [https://ggus.eu/index.php?mode=ticket_info&ticket_id=116865 116865] (12/10)<br />
| |
− | Sno+ job submission failures. Not much on this ticket since it was set In Progress. Looks like an argus problem. How goes things at Sussex before Matt RB moves on? (We'll miss you Matt!). In progress (20/10)
| |
| | | |
− | '''RALPP'''<br />
| + | [https://vo-nagios.physics.ox.ac.uk/nagios/cgi-bin/status.cgi?host=all&servicestatustypes=16&hoststatustypes=15 '''Other VO Nagios'''] |
− | [https://ggus.eu/index.php?mode=ticket_info&ticket_id=117261 117261] (28/10)<br />
| + | |
− | Atlas jobs failing with stage out failures. Federico notices that the failures are due to odd errors - "file already existing", and that things seem to be calming themselves. He's at a loss as to what RALPP can do. Checking the panda link suggests the errors are still there today. Waiting for reply (29/10)
| + | |
− | | + | |
− | '''BRISTOL'''<br />
| + | |
− | [https://ggus.eu/index.php?mode=ticket_info&ticket_id=116775 116775] (6/10)<br />
| + | |
− | Bristol's CMS glexec ticket. It looks like the solution is to have more cms pool accounts (which of course requires time to deploy). In progress (28/10)
| + | |
− | | + | |
− | [https://ggus.eu/index.php?mode=ticket_info&ticket_id=117303 117303] (30/10)<br />
| + | |
− | CMS, not Highlander fans, don't seem to believe that There can be only One (glexec ticket). Poor old Bristol seem to be playing whack-a-mole with duplicate tickets. Is there a note that can be left somewhere to stop this happening? Assigned (30/10)
| + | |
− | | + | |
− | '''ECDF'''<br />
| + | |
− | [https://ggus.eu/index.php?mode=ticket_info&ticket_id=95303 95303] (Long long ago)<br />
| + | |
− | Edinburgh's (and indeed Scotgrid's) only ticket is this tarball glexec ticket. A bit more on this later. On hold (18/5)
| + | |
− | | + | |
− | '''SHEFFIELD'''<br />
| + | |
− | [https://ggus.eu/index.php?mode=ticket_info&ticket_id=114460 114460] (18/6)<br />
| + | |
− | Gridpp (and other) VO pilot roles at Sheffield. No news for a while, snoplus are trying to use pilot roles now for dirac so this is becoming very relevant. In progress (9/10)
| + | |
− | | + | |
− | [https://ggus.eu/index.php?mode=ticket_info&ticket_id=116560 116560] (30/9)<br />
| + | |
− | Sno+ jobs failing, likely due to too many being submitted to the 10 slots that Sno+ has. Maybe a WMS scheduling problem - Stephen B has given advice. Elena asked if the problem persisted a few weeks ago. Waiting for reply (12/10)
| + | |
− | | + | |
− | [https://ggus.eu/index.php?mode=ticket_info&ticket_id=116967 116967] (17/10)<br />
| + | |
− | A ROD availability ticket, on hold as per SOP. On hold (20/10)
| + | |
− | | + | |
− | '''LANCASTER'''<br />
| + | |
− | [https://ggus.eu/index.php?mode=ticket_info&ticket_id=116478 116478] (28/9)<br />
| + | |
− | Another availability ticket. Autumn was not kind to many of us! On hold (8/10)
| + | |
− | | + | |
− | [https://ggus.eu/index.php?mode=ticket_info&ticket_id=116882 116882] (13/10)<br />
| + | |
− | Enabling pilot snoplus users at Lancaster. Shouldn't have been a problem, but turned into a bit of a comedy/tragedy of errors by yours truly mucking up. Hopefully fixed now - thanks to Daniela for her patience. In progress (2/11)
| + | |
− | | + | |
− | [https://ggus.eu/index.php?mode=ticket_info&ticket_id=95299 95299] (Far far away)<br />
| + | |
− | glexec tarball ticket. There's been a lot of communication with the glexec devs about this - the hopefully last hurdle is sorting out the RPATHs for the libraries. It's not a small hurdle though... On hold (2/11)
| + | |
− | | + | |
− | '''QMUL'''<br />
| + | |
− | [https://ggus.eu/index.php?mode=ticket_info&ticket_id=117151 117151] (23/10)<br />
| + | |
− | A ticket about jumbo frame problems, submitted to QM. After Dan provided some education the user replied, in that he only sees this problem at two atlas sites. But he is contacting the network admins at his institution to see if it is their end. On hold (29/10)
| + | |
− | | + | |
− | [https://ggus.eu/index.php?mode=ticket_info&ticket_id=117011 117011] (19/10)<br />
| + | |
− | ROD ticket for glue-validate errors. Went away for a while after Dan re-yaimed his site bdii, but possibly back again. Daniela suggests re-running the glue-validate test. Reopened (2/11)
| + | |
− | | + | |
− | [https://ggus.eu/index.php?mode=ticket_info&ticket_id=116689 116689] (6/10)<br />
| + | |
− | Another ROD ticket, where Ops glexec test jobs are seemingly timing out for QM (this is the ticket Daniela mentioned on the ops mailing list). Dan noted that with the cluster half full tests were passing, suggesting some kind of load correlation (but as he also notes - what's getting loaded and causing the problem - Batch, CE or WNs?). Kashif reckons the argus server, and suggests a handy glexec time test which he posted. In progress (2/11)
| + | |
− | | + | |
− | '''BRUNEL'''<br />
| + | |
− | [https://ggus.eu/index.php?mode=ticket_info&ticket_id=117324 117324] (2/11)<br />
| + | |
− | A fresh looking ROD ticket - Raul had to restart the BDII and hopefully that got it. In progress (2/11)
| + | |
− | | + | |
− | '''100IT'''<br />
| + | |
− | [https://ggus.eu/index.php?mode=ticket_info&ticket_id=116358 116358] (22/9)<br />
| + | |
− | Missing Image at 100IT. 100IT have asked for more details, no news since. Waiting for reply (19/10)
| + | |
− | | + | |
− | '''THE TIER 1'''<br />
| + | |
− | [https://ggus.eu/index.php?mode=ticket_info&ticket_id=116866 116866] (12/10)<br />
| + | |
− | Snoplus pilot enablement (not actually a word) at the Tier 1. New accounts were being requested after some internal discussion. On hold (19/10)
| + | |
− | | + | |
− | [https://ggus.eu/index.php?mode=ticket_info&ticket_id=116864 116864] (12/10)<br />
| + | |
− | CMS AAA tests failing (the submitter notes "again..."). There are some oddities with other sites, which might be remote problems, but Andrew notes that previous manual fixes have been overwritten which likely explains why problems came back. In progress (does it need to be waiting for a reply?) (26/10)
| + | |
− | | + | |
− | [https://ggus.eu/index.php?mode=ticket_info&ticket_id=117171 117171] (24/10)<br />
| + | |
− | LHCB had problems with an arc CE that was misbehaving for everyone. Things were fixed, and this ticket can now be closed. Waiting for reply (can be closed) (27/10)
| + | |
− | | + | |
− | [https://ggus.eu/index.php?mode=ticket_info&ticket_id=117277 117277] (30/10)<br />
| + | |
− | Atlas have spotted "bring online timeout has been exceeded". This appears to be a mixture of problems adding up, such as a number of broken disk nodes and heavy write access by atlas. In progress (2/11)
| + | |
− | | + | |
− | [https://ggus.eu/index.php?mode=ticket_info&ticket_id=117248 117248] (28/10)<br />
| + | |
− | I believe related to the discussion on tb-support, this ticket requests that new SRM host certs that meet the requirements specified be requested for the RAL SRMs. Jens was on it, and the new certs are ready to be deployed. In progress (30/10)
| + | |
− | | + | |
− | [https://vo-nagios.physics.ox.ac.uk/nagios/cgi-bin/status.cgi?host=all&servicestatustypes=16&hoststatustypes=15 '''Other VO Nagios'''] - some badness at Sussex, but they have a ticket open for that. | + | |
| | | |
| <!-- ******************Edit stop********************* -----> | | <!-- ******************Edit stop********************* -----> |