|
|
Line 391: |
Line 391: |
| ===== ===== | | ===== ===== |
| <!-- ******************Edit start********************* -----> | | <!-- ******************Edit start********************* -----> |
− | '''Monday 7th April 2014, 13.30 BST'''<br /> | + | '''Monday 14th April 2014, 15.30 BST'''</ br> |
| + | No ticket update from Matt next week. |
| | | |
− | 32 Open UK tickets this week.
| + | 33 Open UK tickets today. |
| | | |
− | '''No site in particular.'''<br /> | + | '''NGI''' (No Geezers In-particular in this case)</ br> |
− | https://ggus.eu/index.php?mode=ticket_info&ticket_id=101502 (24/2)<br /> | + | https://ggus.eu/index.php?mode=ticket_info&ticket_id=101502 (24/2)</ br> |
− | The ILC cvmfs rollout ticket. Glasgow, Oxford, Durham and Bristol were missing at last head count - although Glasgow are mid-rollout and should be fully deployed any day now (if not already). I think Oxford are in a similar boat? As JK points out, we've got to the point where we probably need to put the ticket on hold whilst I harass the last few stragglers. In progress (3/4)
| + | ILC cvmfs ticket. No change since last week really; after tomorrow's meeting I'll put this ticket on hold until I'm back next week. In progress (3/4) |
| | | |
− | https://ggus.eu/index.php?mode=ticket_info&ticket_id=103043 (7/4)<br /> | + | https://ggus.eu/index.php?mode=ticket_info&ticket_id=103043 (7/4)</ br> |
− | Squire Whyntie has asked for cern@school registration on the Imperial Dirac. Janusz has done so and Tom confirmed it works and can be solved. If only all things were solved so quickly! Assigned (7/4)
| + | Tom's ticket requesting cern@school access to the IC Dirac server. It's all done, the ticket just needs closing (and whilst I'm happy to stick my nose into tickets I won't close or reopen them). Assigned(!) (7/4) |
| | | |
− | '''SUSSEX'''<br />
| + | https://ggus.eu/index.php?mode=ticket_info&ticket_id=103197 (9/4)</ br> |
− | https://ggus.eu/index.php?mode=ticket_info&ticket_id=102810 (28/3)<br /> | + | Chris W has spotted several instances where the old myproxy server shows up in the online documentation. Andrew has tried to edit https://www.gridpp.ac.uk/deployment/users/myproxy.html but can't get access - Daniela suggested asking the hosting site but maybe Tom has access? Waiting for Reply (9/4) |
− | The new Sussex EMI2 upgrade ticket. Matt RB copied the Sussex plan over from the original ticket. Daniela cleared up the mystery of what happened to the original ticket (dashboard shenanigans) and posted some useful instructions for the BDII upgrade. In progress (1/4)
| + | |
| | | |
− | '''RALPP'''<br />
| + | https://ggus.eu/index.php?mode=ticket_info&ticket_id=98249 (21/10/2013)</ br> |
− | https://ggus.eu/index.php?mode=ticket_info&ticket_id=102990 (3/4)<br /> | + | The Sno+ CVMFS ticket. Could some of the progress mentioned last week please be put into the ticket? In progress (26/3) |
− | Duncan's unending perfsonar vigilance discovered a problem with the RALPP latency box. Ian reports firewall problems that have been solved, so it looks like this one can be closed (if all is well). In progress (can be closed) (4/4) ''Not quite out of the woods yet after all, Ian spotted and fixed a few more problems, Duncan has spotted something else away.''
| + | |
| | | |
− | https://ggus.eu/index.php?mode=ticket_info&ticket_id=102953 (24/3)<br /> | + | '''QMUL'''</ br> |
− | CMS glidein hammercloud jobs not running at the site (specifically their defunct cream CEs)- Chris points out another ticket (https://ggus.eu/index.php?mode=ticket_info&ticket_id=102915) essentially detailing the same problem (just for different job types). Probably worth on holding this one whilst waiting on the other, as it looks like the problems are CMS side. In progress (2/4)
| + | https://ggus.eu/index.php?mode=ticket_info&ticket_id=103028 (6/4)</ br> |
| + | Chris ran these atlas job failures down and discovered they were due to the jobs going over their memory quotas. What I didn't like the look of was that it was the jobs themselves requesting these amounts of memory. Atlas says it can be solved, but it's something to watch out for. In progress (11/4) |
| | | |
− | '''OXFORD'''<br /> | + | '''GLASGOW'''</ br> |
− | https://ggus.eu/index.php?mode=ticket_info&ticket_id=103027 (5/4)<br /> | + | https://ggus.eu/index.php?mode=ticket_info&ticket_id=101565 (26/2)</ br> |
− | LHCB pilots aborting, Kashif asks if the problem persists, the ticket fairy set the ticket to Waiting for Reply (5/4) ''SOLVED'' | + | As mentioned last week, LHCB have got back to Glasgow deciding that MaxCPUTime needs to be set to something, Sam respectfully maintains his stance. Steve B links an interesting ticket to the cream devs: https://ggus.eu/index.php?mode=ticket_info&ticket_id=97721 On Hold (8/4) |
| | | |
− | https://ggus.eu/index.php?mode=ticket_info&ticket_id=102469 (19/3)<br />
| + | '''"EMI UPGRADE" tickets.'''</ br> |
− | cvmfs for t2k. I think this has fallen through some cracks, no word for a while. In progress (21/3)
| + | |
| | | |
− | '''BRISTOL'''<br /> | + | '''TIER 1'''</ br> |
− | https://ggus.eu/index.php?mode=ticket_info&ticket_id=102205 (14/3)<br /> | + | https://ggus.eu/index.php?mode=ticket_info&ticket_id=102611</ br> |
− | Bristol's EMI2 upgrade ticket. Not much news, although there was a positive update from Winnie that looks like the April deadline will be made. In progress (4/4)
| + | Kashif points out that the NGI argus isn't in the site bdii, which is the probable cause of the test failures. The other two problem servers are due to be decommissioned, so all good here. In progress (14/4) |
| | | |
− | '''GLASGOW'''<br /> | + | '''DURHAM'''</ br> |
− | https://ggus.eu/index.php?mode=ticket_info&ticket_id=102914 (1/4)<br /> | + | https://ggus.eu/index.php?mode=ticket_info&ticket_id=103722 (14/4)</ br> |
− | An atlas ticket, detailing some odd transfer behaviour for some files, likely attributed to some off tcp window settings on a disk server. There was a similar looking (although possibly not identical) problem at RHUL (https://ggus.eu/index.php?mode=ticket_info&ticket_id=102311). Some interesting stuff. In progress (4/4) ''Sam updated the ticket, with no more "sub-optimally tuned" disk pools. I think it should be set to waiting for reply though''
| + | A very fresh alarm ticket for Durham's CE and SE. Sorry you guys have to do this dance again! Assigned (14/4) |
| | | |
− | https://ggus.eu/index.php?mode=ticket_info&ticket_id=102202 (14/3)<br /> | + | '''EDINBURGH'''</ br> |
− | Not as interesting, Glasgow's EMI upgrade ticket. Chugging along, last word was from David a little while back about watching some atlas canary jobs running on the EMI3 worker nodes. How did these pan out? In progress (27/3) ''Gareth updates that progress is slow but steady, draining nodes is taking a while.''
| + | https://ggus.eu/index.php?mode=ticket_info&ticket_id=102201 (14/3)</ br> |
| + | Andy notes that the links to the alarms given in the ticket appear to be broken. How goes the upgrade in general? On Hold (7/4) |
| | | |
− | https://ggus.eu/index.php?mode=ticket_info&ticket_id=101565 (26/2)<br /> | + | '''RHUL'''</ br> |
− | LHCB asked Glasgow to publish their max CPU time. Not wanting to be made liars of, Sam pointed out why they didn't (shouldn't) do this. This has seemed to send LHCB back to the drawing board, so the ticket is on hold. On Hold (12/3)
| + | https://ggus.eu/index.php?mode=ticket_info&ticket_id=102189 (14/3)</ br> |
| + | I think RHUL just has some CEs to upgrade, have you done the site BDII? The list of services that need to be upgraded isn't exhaustive. On hold (21/3) |
| | | |
− | '''EDINBURGH'''<br /> | + | '''SUSSEX'''</ br> |
− | https://ggus.eu/index.php?mode=ticket_info&ticket_id=102201 (14/3)<br /> | + | https://ggus.eu/index.php?mode=ticket_info&ticket_id=102810 (28/3)</ br> |
− | The ECDF EMI upgrade ticket. Not much to report here, although the apel box threw a wobbly as well, Andy's on it. In progress (2/4)
| + | You guys put in a good plan, did it survive contact with the enemy? In progress (1/4) |
| | | |
− | https://ggus.eu/index.php?mode=ticket_info&ticket_id=95303 (1/7/13)<br /> | + | '''GLASGOW'''</ br> |
− | glexec ticket. Word on that later. On Hold (27/1)
| + | https://ggus.eu/index.php?mode=ticket_info&ticket_id=102202 (14/3)</ br> |
| + | The Glasgow list of services to upgrade was long, but that's just a reflection of how much stuff they run. Gareth gave a good update last week, so there's naught to worry about here (hopefully I didn't just curse you...). In Progress (8/4) |
| | | |
− | '''DURHAM'''<br /> | + | '''BRISTOL'''</ br> |
− | https://ggus.eu/index.php?mode=ticket_info&ticket_id=102199 (14/3)<br /> | + | https://ggus.eu/index.php?mode=ticket_info&ticket_id=102205 (14/3)</ br> |
− | Another EMI upgrade deadline ticket. A plan is in place and the work is underway. On Hold (24/3)
| + | Winnie sounded confident that the upgrade will be done by the end of April (and we aren't halfway through the month yet). In progress (4/4) |
| | | |
− | '''SHEFFIELD'''<br /> | + | '''UCL'''</ br> |
− | https://ggus.eu/index.php?mode=ticket_info&ticket_id=100037 (3/1)<br /> | + | https://ggus.eu/index.php?mode=ticket_info&ticket_id=102193 (14/3)</ br> |
− | Sheffield's perfsonar having trouble. Elena upgraded and got the Sheffield IT guys to open port 8086 - it looks like she's nailed the problem and has asked for confirmation. Waiting for reply (7/4) (And before I even finished the review, the ticket was solved).
| + | Ben set a reminder date for the 31st of March, no news since then. On hold (14/3) |
| | | |
− | '''LANCASTER'''<br /> | + | '''EFDA-JET'''</ br> |
− | https://ggus.eu/index.php?mode=ticket_info&ticket_id=95299 (1/7)<br /> | + | https://ggus.eu/index.php?mode=ticket_info&ticket_id=102166 (14/3)</ br> |
− | GLEXEC ticket. The tarball glexec isn't going well (no thanks to EMI3 taking up the last 6 weeks of tarball time). I might have to admit defeat (but will ask the devs for help before I do). On hold (4/4)
| + | It's just the Jet DPM that looks like it needs upgrading. If they've kept it up to date then this upgrade is trivial. Hope to be done by the end of April. On hold (24/3) |
| | | |
− | https://ggus.eu/index.php?mode=ticket_info&ticket_id=100566 (27/1/13)<br />
| |
− | Lancaster's Poo Perfsonar Performance (I said I wouldn't use that alliteration again, I lied). Using "normal" iperf to probe the boxes I see no 1Gb bottlenecks in my network, could the problem be software? On hold (7/4)
| |
− |
| |
− | '''UCL'''<br />
| |
− | https://ggus.eu/index.php?mode=ticket_info&ticket_id=101285 (16/2)<br />
| |
− | UCL's perfsonar also having difficulty, although their difficulty is caused by the hardware going kaput on them. Ben is chasing up Dell for new bits. In progress (3/4)
| |
− |
| |
− | https://ggus.eu/index.php?mode=ticket_info&ticket_id=102193 (14/3)<br />
| |
− | EMI upgrade ticket. Ben put in a brief plan, but the reminder date has passed. How goes it? The bdii and DPM are fairly straightforward to upgrade. On hold (14/3)
| |
− |
| |
− | https://ggus.eu/index.php?mode=ticket_info&ticket_id=95298 (1/7/13)<br />
| |
− | GlexeC ticket. No news for a while, is this work to be rolled into the EMI3 upgrade? On hold (27/1)
| |
− |
| |
− | '''RHUL'''<br />
| |
− | https://ggus.eu/index.php?mode=ticket_info&ticket_id=102189 (14/3)<br />
| |
− | RHUL's EMI upgrade ticket. Not much news here. On hold (21/3)
| |
− |
| |
− | '''QMUL'''<br />
| |
− | https://ggus.eu/index.php?mode=ticket_info&ticket_id=103028 (6/4)<br />
| |
− | Atlas seeing production jobs failing due to pilot errors. Chris asked if production job options have changed recently? The ticket fairy struck again, setting the ticket to Waiting for reply (although he's less sure if that was the intention of Chris' reply). In progress (7/4) ''Atlas replied saying that they don't think there have been any job changes. Full prod disk is making things even cloudier, but Dan has asked for clarification on what an error message actually means - "!!FAILED!!1999!! Job killed by signal 24: Signal handler has set job result to FAILED, ec = 1204" ''
| |
− |
| |
− | https://ggus.eu/index.php?mode=ticket_info&ticket_id=101639 (26/2)<br />
| |
− | RFC3820 proxy problems at QM (and elsewhere). JK has asked the submitter for his ticket intentions. Set to Waiting for reply by our friend, the ticket fairy. (1/4)
| |
− |
| |
− | (Please remember to set your tickets to Waiting for Reply after asking a question to the submitter. Don't make me spend yet another Monday afternoon referring to myself as the ticket fairy.)
| |
− |
| |
− | '''IMPERIAL'''<br />
| |
− | https://ggus.eu/index.php?mode=ticket_info&ticket_id=102888 (1/4)<br />
| |
− | Biomed asked for access to their cvmfs repo to be rolled out at IC. Daniela has said fine but asked that they completely migrate to it within 3 months (nfs or cvmfs). Daniela has completed the rollout and asked biomed to test. Waiting for reply (7/4) ''Biomed have got back saying that they've launched some test jobs, but expect it might take a while for them to run. I think they also were kinda asking if Imperial would give them some leeway on moving wholly to cvmfs.''
| |
− |
| |
− | '''EFDA-JET'''<br />
| |
− | https://ggus.eu/index.php?mode=ticket_info&ticket_id=102166 (14/3)<br />
| |
− | The JET EMI upgrade ticket. There was a hope to upgrade before the end of April. On hold (24/3)
| |
− |
| |
− | https://ggus.eu/index.php?mode=ticket_info&ticket_id=97485 (21/9/13)<br />
| |
− | SSL type errors for LHCB at JET. No progress on this for a while, the problem somehow survived the move to SL6/EMI3. On hold (11/2)
| |
− |
| |
− | '''TIER 1'''<br />
| |
− | https://ggus.eu/index.php?mode=ticket_info&ticket_id=102611 (24/3)<br />
| |
− | The Tier 1 EMI upgrade ticket. There seem to be some false positives on the list, and it would help to clarify which these are (especially given the dashboard noise on the ticket). In progress (27/3)
| |
− |
| |
− | https://ggus.eu/index.php?mode=ticket_info&ticket_id=98249 (21/10/13)<br />
| |
− | CVMFS for SNO+. Matt reported that the collaboration has given permission to have their software on cvmfs, and hoped to have tarballs ready for last week. Has there been any progress offline? In progress (26/3) ''Update - Squire Whyntie informed me that this is being actively worked on offline, with Tom kindly providing assistance.''
| |
− |
| |
− | https://ggus.eu/index.php?mode=ticket_info&ticket_id=101079 (9/2)<br />
| |
− | ARC CEs publishing the wrong DefaultSE. Andrew has hacking this on his to-do list, but bumped this issue down the list (which is fine, as it's low priority). In progress (1/4)
| |
− |
| |
− | https://ggus.eu/index.php?mode=ticket_info&ticket_id=99556 (6/12/13)<br />
| |
− | The NGI argus ticket. I'm pretty sure that this can be closed, as argusngi.gridpp.rl.ac.uk is setup and tested by several sites- so all looks well here. On Hold (21/3)
| |
− |
| |
− | https://ggus.eu/index.php?mode=ticket_info&ticket_id=101968 (11/3)<br />
| |
− | Atlas deletion errors at the Tier 1. The problem is known, but not well understood, and sadly persists (last set of errors reported on the 4th). Alastair has put in a good explanation of the symptoms. On hold (4/4)
| |
| | | |
| <!-- ******************Edit stop********************* -----> | | <!-- ******************Edit stop********************* -----> |