Difference between revisions of "Past Ticket Bulletins 2018"

From GridPP Wiki

Revision as of 12:51, 19 February 2018

Monday 12th February 2018, 17.00 GMT
46 Open UK Tickets this week.

Link to all the UK Tickets: http://tinyurl.com/nwgrnys

It doesn't feel like a very exciting week for tickets - although it's worth noting that Sno+ seem to be having a ticket drive, cleaning up problems that they're seeing.

There's a RHUL ticket (133409) that needs acknowledging, and there are a few tickets from CMS regarding data transfers that just seem confusing to me (133390 and 133389 at RALPP, 133344 at Imperial) - although sites aren't to blame for this confusion!

Completely anecdotally (citing 133424), is it me or does CVMFS feel less robust recently? It of course could just be me.

Finally I'll take this opportunity to do my bi-annual reminder to sites to please check the status of their tickets - when you start working on a ticket please make sure to set it 'In Progress', when you ask a question please mark the ticket 'Waiting for reply' and when you're not going to make any progress for a while please set the ticket 'On Hold'. Finally finally, it's not really worth leaving tickets for too long before closing them - a day or two is usually more than enough.

Monday 5th February 2018, 15.30 GMT
38 Open UK Tickets this month

IPv6 Tickets
Sussex: 131617 On Hold (15/11/17)
RALPP: 131616 Chris put in a nice update a fortnight ago, citing some perfsonar problems. In progress (31/1)
Oxford: 131615 No recent news on the ticket but I think there's v6 progress at Oxford? On hold (7/11/17)
Cambridge: 131614 On hold (15/11/17)
Bristol: 131613 Early February was the estimated time to get the perfsonar boxes dual stacked, how's that looking? On hold (7/11)
Birmingham: 131612 Duncan poked the ticket last month. On hold (11/11/17)
Glasgow: 131611 I think any further news awaits you chaps moving into your new digs (once they're built). On hold (6/11)
ECDF: 131610 Planning is underway, Raul has kindly offered to help. In progress (5/2)
Durham: 131609 The v6 reverse DNS at Durham is still not working, Adam has provided an update on this. In progress (31/1)
Sheffield: 131608 Is there any way we can help encourage the University to enable v6 for you? On hold (6/11/17)
Manchester: 131607 Duncan reckons you now have v6 reverse DNS lookup, so that's good news. On hold (1/2)
Liverpool: 131606 As further progress here is reliant on some upstream routers getting upgraded, maybe this ticket should be put on hold? In progress (14/11/17)
Lancaster: 131605 Lancaster is just waiting on some testing from a v6 only endpoint. I'm working on setting up a v6 only UI to see if that helps. In progress (5/2)
UCL: 131604 Waiting on central IT to get back. On hold (15/1)
RHUL: 131603 RHUL's perfsonar boxen are now dualstacked - nice. On hold (31/1)

Regular Tickets:

SUSSEX
122772 (11/7/16)
Atlas xroot/webdav ticket. At last word just before Christmas Leo was waiting on some ports being opened up in the external firewall. Any joy? In progress (19/12/17)

RALPP
133250 (5/2/1042)
A ROD ticket - the date looks a bit suspect (I don't think GGUS has been around for that long). The test (ch.cern.WebDAV) and the server failing it (mover.pp.rl.ac.uk) both sound a bit weird too. Assigned (2/2/2018)

133274 (5/2)
CMS xroot failures. Things were fixed by a trusty restart script, but Chris has asked about the state of the AAA network. Waiting for reply (5/2)

OXFORD
133215 (31/1)
Atlas deletion errors on the newly reinstalled Oxford SE. After consulting on the dpm list Kashif tweaked his mysql settings and is in the "wait and see" phase. In progress (5/2)

BRISTOL
133220 (1/2)
CMS hammercloud jobs hitting their wall clock limit - the reason for which is proving a bit of a mystery. Luke has looked into this very closely so far, but it might be some weird emergent property. In progress (2/2)

BIRMINGHAM
132569 (19/12/17)
Dirac pilots not being able to be submitted to Birmingham. I think the problem is well understood; have the affected VOs been removed from the bdii? Assigned (22/1)

129930 (4/8/17)
Atlas http tests failing at Birmingham. Perhaps Kashif might have some insight into this after his recent DPM adventure? Although maybe this ticket will become moot. On hold (16/11/17)

GLASGOW
133115 (29/1)
Checking if the new lhcb condb cvmfs mount is mounted. For some odd reason some of the Glasgow CEs are failing/not running the tests, despite all the tests running across the same WNs. In progress (5/2) Update - LHCB seem to think this is a problem with the tests, and so the ticket can be closed.

ECDF
133222 (5/2/3164)
A ROD ticket from the distant future! The tests look okay now, so I suspect this ticket can be closed. Waiting for reply (5/2/2018)

SHEFFIELD
133019 (24/1)
Low availability ticket, all good. On hold (30/1)

133260 (3/2)
Atlas transfers failing. Any luck debugging this? In progress (3/2)

MANCHESTER
131526 (1/11/17)
Storage accounting deployment. Were there some roadblocks for this? On hold (12/1)

LIVERPOOL
133114 (29/1)
New LHCB mountpoint ticket. It looks like this ticket was missed. Assigned (29/1)

RHUL
132715 (4/1)
Supporting hyperk.org. Any word on this? In progress (22/1)

QMUL
132713 (4/1)
Support for hyperk.org. Sadly, despite some fixing, errors persist. In progress (5/2)

132929 (18/1)
CMS APEL problem for QM jobs. Due to a problem with SLURM, Dan originally "unsolved" this ticket. Reopened with some useful tips, but the apel team has been involved to check on this, which was the right call. In progress (29/1)

BRUNEL
132876 (16/1)
CMS seeing reading issues at Brunel. After some expert debugging from Raul I think we're waiting on the CERN ticket 133010. In progress (5/2)

IMPERIAL (kinda)
132688 (3/1)
A lost pheno files ticket that bounced back to IC. Just waiting for word back from users (which may take a while). In progress (25/1)

TIER 1
132589 (21/12/17)
Killed LHCB pilots at the Tier 1. There's a proposal to mark the ticket "unsolved", but Vladimir seems reluctant to do this. In progress (31/1)

117683 (18/11/15)
The old Glue 2 publishing for Castor ticket. Last news is that a prototype version is in testing. On hold (3/1)

127597 (4/7/17)
CMS ticket checking xroot and network performance. Chris provided a good news update - new firewall hardware is on its way. However, this might not fix things; Chris warns more work might be needed. On hold (29/1)

124876 (7/11/16)
Echo failing gridftp nagios tests - due to the tests being broken. Absolutely no movement on the linked ticket to fix the tests (125026). On hold (13/11/17)

132708 (4/1)
The ticket tracking the decommissioning for the RAL WMSseses. It's going well. In progress (18/1)

Monday 29th January 2018, 15.30 GMT
43 Open UK Tickets this week.

New LHCB mountpoint tickets
LHCB have ticketed a bunch of sites to make sure that they have "/cvmfs/lhcb-condb.cern.ch" accessible on their WNs. It's a simple case of check and close, LHCB will do the verification their end afterwards.
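For anyone who wants to script the check, here is a minimal sketch of one way to do it on a WN. The mount path is the one LHCB quote above; the function name and the choice of "can I list the directory?" as the test are my own assumptions, not anything LHCB have specified.

```python
import os

# Path quoted in the LHCB tickets; everything else here is illustrative.
LHCB_CONDB = "/cvmfs/lhcb-condb.cern.ch"


def cvmfs_repo_accessible(path=LHCB_CONDB):
    """Return True if `path` exists and its contents can be listed."""
    try:
        # On nodes that mount CVMFS on demand via autofs, touching the
        # path like this is what prompts the repository to be mounted.
        os.listdir(path)
    except OSError:
        # Covers a missing path, a non-directory and permission errors.
        return False
    return True


if __name__ == "__main__":
    status = "accessible" if cvmfs_repo_accessible() else "NOT accessible"
    print(f"{LHCB_CONDB}: {status}")
```

Run it on a representative WN (or wrap it in a batch job) before closing the ticket. Note that an empty directory would still pass, so a stricter version might also insist on the listing being non-empty.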

BIRMINGHAM
132569 (19/12/17)
I'm not sure if some solid actions were planned out that week for this ticket, but it could do with an update. I think the decision was simply to remove the dirac supported VOs from the CREAM CE bdii? Assigned (should be a different status) (22/1)

BRUNEL
132876 (16/1)
I'm not sure what's going on in this CMS xroot ticket, but I'm wondering if the original issue either still exists or was not a Brunel problem after all. This ticket either can be closed, or perhaps put on hold whilst the related CERN ticket is sorted. In progress (23/1)

ECDF
132446 (11/12/17)
It looks like this ticket tracking dirac jobs having batch system problems can be closed after some tweaking of the argus servers. In progress (26/1)

Also I think the corresponding hyperk support ticket 132716 can be closed too.

RHUL
132715 (4/1)
It might well be that you're still in the middle of network maintenance, but a polite reminder of this hyperk support ticket. In progress (22/1)

TIER 1
132712 (4/1)
Still on the hyperk support ticket, this ticket was just waiting on the hyperk configs to get into quattor. Has that happened yet? In progress (23/1) Update - solved

132589 (21/12/17)
Raja has updated the ticket to sadly report that they are still seeing LHCB job deaths at RAL. In progress (29/1) A further update this morning from Vladimir asks to check on a bunch of jobs' statuses.

132708 (4/1)
Just for information, this is the ticket tracking the decommissioning of the RAL WMSses. In progress (18/1)

Monday 22nd January 2018, 15.00 GMT
54 Open UK Tickets this year.

Start with the good news - these tickets look like they can be closed:

BRISTOL
132880 (16/1)
It looks like transfers are working after the firewall fix. In progress (19/1) Solved, but CMS have hit Bristol with another xroot ticket: 132990

QMUL
132615 (26/12/17)
After changing the working directory LHCB jobs don't seem to be running out of space anymore, so the ticket can be closed. In progress (20/1)

TIER 1
132712 (4/1)
There seems to be positive news getting hyperK jobs working at the Tier 1, so maybe this ticket is sorted? In progress (22/1)

RALPP
132830 (12/1)
This complex CMS xroot ticket looks likely to be solved (in fact Chris might be closing the ticket as I type). In progress (19/1) Solved

Now onto the bad:

RHUL
132715 (4/1)
This ticket from Daniela about supporting the hyperK VO seems to have gone unnoticed. Can you please notice it? Assigned (4/1)

RALPP
132851 (15/1)
This CMS xroot ticket might be related to the one above, hence why it's not been tended to (indeed it might be able to be closed too). There's a request for some verbose output of an xrdcp from different CMS peeps, so the conversation is out of the site's hands for now. In progress (17/1)

QMUL
132713 (4/1)
Fixing hyperk jobs at QM on a couple of CEs. Dan had a kick of things a while back, how did that work out? In progress (4/1)

BIRMINGHAM
132569 (19/12)
Daniela spotted Dirac problems at Birmingham. Ultimately this is fallout from the Birmingham move to VAC, Daniela has suggested that Mark remove the VOs from the BDII to stop dirac sending jobs to an almost dead CE. Assigned (should be something else) (22/1)

MANCHESTER
132121 (28/11/17)
Any news or progress with this ticket to the VOMS service? There have been no updates with words in them from any site admins. In progress (1/12/17)

TIER 1
132589 (21/12/17)
LHCB pilots were still failing at the Tier 1 as of Raja's last post; this ticket could do with an update from the Tier 1's side. In progress (10/1)

And the Ugly are a few tickets that need updates from the VOs:

MANCHESTER
132468 (14/12/17)
Alessandra updated this atlas transfer ticket with news that she has informed atlas of many lost files that were causing the errors. No news from anyone since. Perhaps someone from cloud support could update things? In progress (4/1)

IMPERIAL
132688 (3/1)
Daniela tried to poke Pheno over some lost files, but has had nothing but silence from them. Must have not been important files. Assigned (19/1)

132692 (3/1)
This LHCB ticket is in the same state as the Pheno one - waiting for someone from the VO to acknowledge the lost files. Assigned (3/1)

132683 (3/1)
The atlas equivalent of the previous two, Brian jumped on it when poked through another channel - so maybe these lines of communication aren't getting to where they should? In progress (22/1)

Extra extra...

Raul pointed out on tb-support this Brunel ticket 132876, which points to an IPv6 config issue and has been thrown back towards the T0 to fix things (132993).