Difference between revisions of "Operations Bulletin Latest"
m (→URL fix) |
(→) |
||
Line 537: | Line 537: | ||
===== ===== | ===== ===== | ||
<!-- ******************Edit start********************* -----> | <!-- ******************Edit start********************* -----> | ||
− | ''' | + | '''Monday 8th April 2019, 14.00 BST'''<br /> |
+ | 38 Open Tickets this month. | ||
+ | |||
+ | '''RALPP'''<br /> | ||
+ | [https://ggus.eu/?mode=ticket_info&ticket_id=139539 139539] (5/2)<br /> | ||
+ | A ticket from Duncan regarding blocked perfsonar ports. The host is failing to talk to itself due to odd reasons. Any luck finding the time to look again at this? Duncan posted a few hints a month ago. In Progress (14/2) | ||
+ | |||
+ | '''OXFORD'''<br /> | ||
+ | [https://ggus.eu/?mode=ticket_info&ticket_id=140134 140134] (11/3)<br /> | ||
+ | Atlas jobs seeing an "Unspecified grid manager error". The classic grid error. Prompted the discussion about atlas asking for jobs not to be killed due to using too much memory. Oxford have raised their default memory per job to 4GB (and they won't kill a job unless it uses 1.5 times that). Waiting to see if that fixes things. How does it look atlas side? Waiting for reply (2/4) | ||
+ | |||
+ | [https://ggus.eu/?mode=ticket_info&ticket_id=138647 138647] (3/12/2018)<br /> | ||
+ | T2K DFC migration ticket. If I had left it a few hours this ticket would be closed, as Kashif has successfully renamed the files without having to do anything DOMEy. Daniela is re-registering the files in the DFC and hopefully this will be sorted soon. In progress (8/4) | ||
+ | |||
+ | [https://ggus.eu/?mode=ticket_info&ticket_id=131615 131615] (3/11/2017)<br /> | ||
+ | Oxford's IPv6 ticket. Kashif provided a light update last month, still not much movement but there are plans made. On hold (13/3) | ||
+ | |||
+ | '''BIRMINGHAM'''<br /> | ||
+ | [https://ggus.eu/?mode=ticket_info&ticket_id=140573 140573] (4/4)<br /> | ||
+ | A request from biomed to update the .lsc information. Is this even relevant for your site anymore? Either way the ticket needs acknowledging (or straight up "not relevant to our site"-ing). Assigned (4/4) | ||
+ | |||
+ | [https://ggus.eu/?mode=ticket_info&ticket_id=140584 140584] (4/4)<br /> | ||
+ | A ROD ticket for your cream CE Birmingham only has to not get these tickets. Looks like a simple lcg-CA (or whatever they're called these days) update needed on your WN. As this ticket also hasn't been noticed I wonder if Mark was off work on the 4th? Assigned (4/4) | ||
+ | |||
+ | [https://ggus.eu/?mode=ticket_info&ticket_id=131612 131612] (3/11/2017)<br /> | ||
+ | Birmingham's v6 ticket. Things were progressing nicely but slowly back in February. Any time to have any joy on this since then? In progress (5/2) | ||
+ | |||
+ | '''GLASGOW'''<br /> | ||
+ | [https://ggus.eu/?mode=ticket_info&ticket_id=140151 140151] (12/3)<br /> | ||
+ | LHCB jobs seeing "can't start new thread" errors. The problem isn't understood as far as I can see but it appeared to disappear on its own - on hold to see if it comes back. On hold (27/3) | ||
+ | |||
+ | [https://ggus.eu/?mode=ticket_info&ticket_id=140222 140222] (15/3)<br /> | ||
+ | MICE DFC migration ticket. Daniela repoked to ask for an ETA on getting those checksums. In progress (15/3) | ||
+ | |||
+ | [https://ggus.eu/?mode=ticket_info&ticket_id=131611 131611] (3/11/2017)<br /> | ||
+ | Glasgow's v6 Ticket. Any news in the last two months about why your v6 performance is pants? In progress (5/2) | ||
+ | |||
+ | '''DURHAM'''<br /> | ||
+ | [https://ggus.eu/?mode=ticket_info&ticket_id=131609 131609] (3/11/2017)<br /> | ||
+ | Just a v6 ticket at Durham. Really could do with an update. On Hold (4/12/2018) | ||
+ | |||
+ | '''SHEFFIELD'''<br /> | ||
+ | [https://ggus.eu/?mode=ticket_info&ticket_id=138649 138649] (3/12/2018)<br /> | ||
+ | T2K DFC migration ticket. Elena is asking Kashif for his secrets in renaming DPM files. On Hold (8/4) | ||
+ | |||
+ | [https://ggus.eu/?mode=ticket_info&ticket_id=131608 131608] (3/11/2017)<br /> | ||
+ | Sheffield's v6 ticket. Really, really, really needs an update. And probably putting on hold if there are no positive updates. In progress (30/10/2018) | ||
+ | |||
+ | '''MANCHESTER'''<br /> | ||
+ | [https://ggus.eu/?mode=ticket_info&ticket_id=131607 131607] (3/11/2017)<br /> | ||
+ | Just the v6 ticket here. Please please can we get an update here? And again, if there is no positive news expected, can it be set on hold too? In progress (3/12/2018) | ||
+ | |||
+ | '''LIVERPOOL'''<br /> | ||
+ | [https://ggus.eu/?mode=ticket_info&ticket_id=139411 139411] (30/1)<br /> | ||
+ | What's the plan for this biomed space token ticket? On hold (1/2) | ||
+ | |||
+ | [https://ggus.eu/?mode=ticket_info&ticket_id=138648 138648] (3/12/2018)<br /> | ||
+ | T2K DFC migration ticket. The VO would like some folders just plain deleted to help clear things up. In progress (8/4) | ||
+ | |||
+ | [https://ggus.eu/?mode=ticket_info&ticket_id=131606 131606] (3/11/2017)<br /> | ||
+ | Liverpool's v6 ticket. As we're at the start of a new FY are plans being formed for the new network upgrades? On hold (6/2) | ||
+ | |||
+ | [https://ggus.eu/?mode=ticket_info&ticket_id=139683 139683] (14/2)<br /> | ||
+ | Not really a problem, the decommissioning ticket for the site's SL6 CE. Nice and by the book. In progress (12/3) | ||
+ | |||
+ | '''UCL'''<br /> | ||
+ | [https://ggus.eu/?mode=ticket_info&ticket_id=139101 139101] (8/1)<br /> | ||
+ | ROD APEL-Pub alarm for UCL's VAC cluster. I think the site is a bit dead in the water here - has there been any news or progress? At last check Ben couldn't install ViaB so was stuck. In progress (4/3) | ||
+ | |||
+ | '''RHUL'''<br /> | ||
+ | [https://ggus.eu/?mode=ticket_info&ticket_id=131603 131603] (3/11/2017)<br /> | ||
+ | RHUL's v6 ticket. No news on this since January, is there still no news? Are the right people being prodded and poked? In progress (23/1) | ||
+ | |||
+ | '''QMUL'''<br /> | ||
+ | [https://ggus.eu/?mode=ticket_info&ticket_id=140190 140190] (14/3)<br /> | ||
+ | LHCB seeing FTS problems. After a lot of poking it seemed the problem went away after a few service and node restarts for unrelated reasons. Brian notices that the FTS monitoring is looking okay now - can this ticket be closed? Or do we want to watch it a bit longer? In progress (2/4) | ||
+ | |||
+ | [https://ggus.eu/?mode=ticket_info&ticket_id=140628 140628] (8/4)<br /> | ||
+ | Atlas jobs failing because the work directory is too large (wait, what? Too much space?). Fresh in today - Dan is on the atlas cloud support list getting advice as I type. Assigned (8/4) | ||
+ | |||
+ | [https://ggus.eu/?mode=ticket_info&ticket_id=138364 138364] (19/11/2018)<br /> | ||
+ | T2K DFC migration ticket. Daniela has repoked the ticket with urgency! In progress (1/3) | ||
+ | |||
+ | [https://ggus.eu/?mode=ticket_info&ticket_id=134573 134573] (17/4/2018)<br /> | ||
+ | CMS request to install singularity. Dan rolled this into the C7 migration, how's that going? On hold (5/11/2018) | ||
+ | |||
+ | '''IMPERIAL'''<br /> | ||
+ | [https://ggus.eu/?mode=ticket_info&ticket_id=138359 138359] (19/11/2018)<br /> | ||
+ | T2K DFC migration master ticket. | ||
+ | |||
+ | [https://ggus.eu/?mode=ticket_info&ticket_id=140198 140198] (14/3)<br /> | ||
+ | MICE DFC migration master ticket. | ||
+ | |||
+ | No real tickets at IC - just master tickets tracking other issues. | ||
+ | |||
+ | '''BRUNEL'''<br /> | ||
+ | [https://ggus.eu/?mode=ticket_info&ticket_id=140619 140619] (8/4)<br /> | ||
+ | CMS transfers of a file failing, due to a classic disk server down error. Raul is working to bring it back from the brink. In progress (8/4) | ||
+ | |||
+ | [https://ggus.eu/?mode=ticket_info&ticket_id=140223 140223] (15/3)<br /> | ||
+ | MICE DFC migration ticket. Progressing nicely, just one more job to do (a big file move). In progress (8/4) | ||
+ | |||
+ | [https://ggus.eu/?mode=ticket_info&ticket_id=140598 140598] (4/4)<br /> | ||
+ | Another CMS ticket due to the same server being down. In progress (5/4) | ||
+ | |||
+ | '''THE TIER 1'''<br /> | ||
+ | [https://ggus.eu/?mode=ticket_info&ticket_id=140599 140599] (5/4)<br /> | ||
+ | LHCB data access problems. Restarting a castor server seems to have got things going again, can this ticket be closed then? In progress (5/4) | ||
+ | |||
+ | [https://ggus.eu/?mode=ticket_info&ticket_id=140577 140577] (4/4)<br /> | ||
+ | Really a ticket to LHCB, George noticed loads of file requests coming in with no service class defined, which has the potential to cause issues. It is being looked at (Matt code for "I scanned the ticket and got lost"); as of Friday there were still a few files coming in with these symptoms. In progress (5/4) | ||
+ | |||
+ | [https://ggus.eu/?mode=ticket_info&ticket_id=140447 140447] (27/3)<br /> | ||
+ | The ever vigilant Duncan spotted v6 outbound packet loss on the RAL perfsonar. Investigations got underway - any results? In progress (2/4) | ||
+ | |||
+ | [https://ggus.eu/?mode=ticket_info&ticket_id=140220 140220] (15/3)<br /> | ||
+ | MICE DFC migration ticket. Daniela has batted questions at both the VO and the Tier 1 today, so it's still ongoing. In progress (8/4) | ||
+ | |||
+ | [https://ggus.eu/?mode=ticket_info&ticket_id=138665 138665] (4/12/2018)<br /> | ||
+ | MICE LFC problem ticket. With the progress on the previous issue I think this can be closed (to either solved or unsolved)? On Hold (30/1) | ||
+ | |||
+ | [https://ggus.eu/?mode=ticket_info&ticket_id=140589 140589] (4/4)<br /> | ||
+ | A case of killed LHCB pilots at RAL. James confessed to an accidental docker restart across many nodes that could be responsible for the carnage. Some more investigation is being done, are jobs still being killed? In progress (4/4) | ||
+ | |||
+ | [https://ggus.eu/?mode=ticket_info&ticket_id=139672 139672] (13/2)<br /> | ||
+ | No LIGO pilots running at RAL. Efforts were made but sadly no fruit was borne from them - and there's been no news for a month. Is this being looked at offline? The last update from the VO had a pleading tone to it. In progress (5/3) | ||
+ | |||
+ | [https://ggus.eu/?mode=ticket_info&ticket_id=138033 138033] (1/11/2018)<br /> | ||
+ | The old atlas singularity jobs failing ticket. This was closed for a while, but Alessandra has reopened the case - the last batch of tests revealed some issues with the RAL configuration. Reopened (6/4) | ||
− | |||
<!-- ******************Edit stop********************* -----> | <!-- ******************Edit stop********************* -----> |
Revision as of 15:07, 8 April 2019
Week commencing Monday 25th February 2019 |
Task Areas |
|
|
Meeting Summaries |
|
|