Difference between revisions of "Operations Bulletin Latest"
Line 423: | Line 423: | ||
===== ===== | ===== ===== | ||
<!-- ******************Edit start********************* -----> | <!-- ******************Edit start********************* -----> | ||
+ | '''Monday 8th February 2016, 13.30 GMT'''<br /> | ||
+ | <strike>44</strike> 43 Open UK Tickets this month. Going over all of them, in kinda-alphabetical order. | ||
+ | '''NGI'''<br /> | ||
+ | [https://ggus.eu/?mode=ticket_info&ticket_id=118930 118930] (18/1)<br /> | ||
+ | That NGI information ticket, linked to the "wrong" (according to some) information being published by the UK arc CEs. This has haunted us for a while. The consensus was that the ticket is a load of B-word and not really worth worrying over, but it does warrant a response (from someone other than Steve J). Assigned (19/1) | ||
+ | |||
+ | '''SUSSEX'''<br /> | ||
+ | With Matt RB off to pastures green Sussex is in limbo - I'll contact Jeremy M concerning this last week's fresh tickets. | ||
+ | |||
+ | [https://ggus.eu/?mode=ticket_info&ticket_id=117894 117894] (23/11)<br /> | ||
+ | Atlas Consistency Checking. On hold (25/1) | ||
+ | |||
+ | [https://ggus.eu/?mode=ticket_info&ticket_id=118289 118289] (10/12)<br /> | ||
+ | Gridpp Pilots. On hold (25/1) | ||
+ | |||
+ | [https://ggus.eu/?mode=ticket_info&ticket_id=118337 118337] (14/12)<br /> | ||
+ | The Sussex SE was not working for Sno+ - the most serious of these older issues. On hold (25/1) | ||
+ | |||
+ | [https://ggus.eu/?mode=ticket_info&ticket_id=119383 119383] (5/2)<br /> | ||
+ | ROD Availability ticket. Assigned (5/2) | ||
+ | |||
+ | [https://ggus.eu/?mode=ticket_info&ticket_id=119384 119384] (5/2)<br /> | ||
+ | ROD CA distribution ticket. Maybe the two ROD tickets are correlated (i.e. if we fix this one the previous one will soothe itself?) Assigned (5/2) | ||
+ | |||
+ | '''RALPP'''<br /> | ||
+ | [https://ggus.eu/?mode=ticket_info&ticket_id=118945 118945] (19/1)<br /> | ||
+ | Poor CMS SAM results for RALPP due to digi-reco work pummeling the RALPP storage - Chris has asked for the digi-reco workload to stop at RALPP, then asked for clarification as to why the site was still in unknown state. Waiting for reply (25/1) | ||
+ | |||
+ | [https://ggus.eu/?mode=ticket_info&ticket_id=118628 118628] (5/1)<br /> | ||
+ | LZ Pilot deployment at RALPP. Chris submitted a bug report to nordugrid about the issue (http://bugzilla.nordugrid.org/show_bug.cgi?id=3529); the bug has been fixed and the fix should be available in the next release. On Hold (26/1) | ||
+ | |||
+ | '''OXFORD'''<br /> | ||
+ | [https://ggus.eu/?mode=ticket_info&ticket_id=119197 119197] (29/1)<br /> | ||
+ | CMS has asked to change some CRAB site configs at T3s - Daniela has asked Chris B if he's the one looking after this for Oxford. Assigned (3/2) | ||
+ | |||
+ | [https://ggus.eu/?mode=ticket_info&ticket_id=117892 117892] (23/11)<br /> | ||
+ | Atlas consistency checks. Ewan has firmly and clearly put this on the backburner. On hold (12/1) | ||
+ | |||
+ | '''BIRMINGHAM'''<br /> | ||
+ | [https://ggus.eu/?mode=ticket_info&ticket_id=118155 118155] (4/12)<br /> | ||
+ | Biomed having a clear up of their stuff on the Brummie SE. Franck has given the nod for deleting the dark data left in the DPM after their cleanup efforts. It's on their heads now! In progress (2/2) | ||
+ | |||
+ | [https://ggus.eu/?mode=ticket_info&ticket_id=117890 117890] (23/11)<br /> | ||
+ | Another Atlas Storage Consistency Checking ticket. Any chance to have a look at this again? On hold (15/12) | ||
+ | |||
+ | '''GLASGOW'''<br /> | ||
+ | [https://ggus.eu/?mode=ticket_info&ticket_id=117706 117706] (19/11)<br /> | ||
+ | Another pilot ticket, this time for pheno. Glasgow were going to roll this into their overhaul of their identity management gubbins, but the Universe messed with their plans. How goes things? On hold (15/1) | ||
+ | |||
+ | [https://ggus.eu/?mode=ticket_info&ticket_id=118052 118052] (30/11)<br /> | ||
+ | HTTP support on the Glasgow SE. I suspect progress here took a similar shoeing to the identity management plan - but the ticket could do with an update (and maybe putting on hold). In Progress (4/1) | ||
+ | |||
+ | '''ECDF'''<br /> | ||
+ | [https://ggus.eu/?mode=ticket_info&ticket_id=118787 118787] (12/1)<br /> | ||
+ | Another HTTP ticket. Let us know if you need a hand, Marcus and Andy. Or if you're too busy to make this a priority, consider on-holding it. In progress (12/1) | ||
+ | |||
+ | [https://ggus.eu/?mode=ticket_info&ticket_id=95303 95303] (1/7)<br /> | ||
+ | Tarball glexec ticket. On hold for a very long time. | ||
+ | |||
+ | ''An update on this - I managed to put in some good hours on trying to build a relocatable glexec last week, successfully building glexec and the lcas/lcmaps stack from source. *But* I still have rpath problems - short of attacking every lib file with patchelf I'm not sure how to proceed, and the process is such a mess that I'm not sure if I'll ever manage to make it into a proper recipe (much like my cocoa-butter shortbread).'' | ||
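
For reference, the "attack every lib file with patchelf" route mentioned above could be sketched roughly as follows. This is only an illustration, not the recipe used for the tarball: the install layout (a `lib` directory under a single prefix), the helper names, and the choice of `$ORIGIN`-relative rpaths are all assumptions. The only external command used is `patchelf --set-rpath`, and the default is a dry run that just prints the commands.

```python
import os
import subprocess


def origin_rpath(elf_path, lib_dir):
    """Return an $ORIGIN-relative rpath from elf_path's directory to lib_dir."""
    rel = os.path.relpath(lib_dir, os.path.dirname(elf_path))
    return "$ORIGIN" if rel == "." else "$ORIGIN/" + rel


def is_elf(path):
    """True if the file starts with the ELF magic bytes."""
    try:
        with open(path, "rb") as fh:
            return fh.read(4) == b"\x7fELF"
    except OSError:
        return False


def plan_rpaths(prefix, lib_subdir="lib"):
    """Map every ELF file under prefix to the rpath it would be given.

    Assumes (hypothetically) that all bundled libraries live in
    <prefix>/<lib_subdir>; a real tarball may need more than one directory.
    """
    lib_dir = os.path.join(prefix, lib_subdir)
    plan = {}
    for root, _dirs, files in os.walk(prefix):
        for name in files:
            path = os.path.join(root, name)
            # Skip symlinks so each real file is only patched once.
            if not os.path.islink(path) and is_elf(path):
                plan[path] = origin_rpath(path, lib_dir)
    return plan


def apply_rpaths(plan, dry_run=True):
    """Print (dry run) or execute one patchelf call per planned file."""
    for path, rpath in sorted(plan.items()):
        cmd = ["patchelf", "--set-rpath", rpath, path]
        if dry_run:
            print(" ".join(cmd))
        else:
            subprocess.run(cmd, check=True)
```

Calling `apply_rpaths(plan_rpaths(prefix))` with the unpacked tarball location as `prefix` would list the commands without touching anything. It is worth checking how the dynamic loader treats `$ORIGIN` for a setuid binary such as glexec before relying on this, as that may be restricted.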
+ | |||
+ | '''SHEFFIELD'''<br /> | ||
+ | [https://ggus.eu/?mode=ticket_info&ticket_id=119374 119374] (5/2)<br /> | ||
+ | A fresh ticket from Biomed, about incorrect/no dynamic information being published at Sheffield. In progress (5/2) | ||
+ | |||
+ | [https://ggus.eu/?mode=ticket_info&ticket_id=118789 118789] (12/1)<br /> | ||
+ | ROD Information system ticket, almost certainly caused by the same underlying issue. Is the bdii service on your CEs silently dying or failing to update? | ||
+ | |||
+ | [https://ggus.eu/?mode=ticket_info&ticket_id=114460 114460] (18/6)<br /> | ||
+ | Gridpp Pilots. Changes were implemented but at last check things weren't working right. How goes it now? In progress (20/1) | ||
+ | |||
+ | [https://ggus.eu/?mode=ticket_info&ticket_id=117886 117886] (23/11)<br /> | ||
+ | Atlas Storage Consistency Check ticket - any luck with this? On hold (29/1) | ||
+ | |||
+ | [https://ggus.eu/?mode=ticket_info&ticket_id=118764 118764] (12/1)<br /> | ||
+ | HTTP support ticket for the Sheffield SE. Have you had a chance to have a look at this? In progress (25/1) | ||
+ | |||
+ | ''The Storage list can lend a hand fixing either of these issues (which goes for everyone of course).'' | ||
+ | |||
+ | '''MANCHESTER'''<br /> | ||
+ | [https://ggus.eu/?mode=ticket_info&ticket_id=118679 118679] (7/1)<br /> | ||
+ | HTTP support (atlas edition). Hit a problem due to there being no outside-a-space-token space at Manchester. On Hold (12/1) | ||
+ | |||
+ | [https://ggus.eu/?mode=ticket_info&ticket_id=118674 118674] (7/1)<br /> | ||
+ | HTTP Support (lhcb edition). As above. On Hold (12/1) | ||
+ | |||
+ | [https://ggus.eu/?mode=ticket_info&ticket_id=117885 117885] (23/11)<br /> | ||
+ | Atlas Storage Consistency Checks - hit the same problem as the previous 2 tickets. On hold (10/1) | ||
+ | |||
+ | [https://ggus.eu/?mode=ticket_info&ticket_id=118603 118603] (4/1)<br /> | ||
+ | A VOMS ticket rather than a site ticket, removal of the nsccs.ac.uk VO. The VO has been removed from the other UK voms servers. In progress (5/2) | ||
+ | |||
+ | '''LANCASTER'''<br /> | ||
+ | [https://ggus.eu/?mode=ticket_info&ticket_id=95299 95299] (1/7)<br /> | ||
+ | Lancaster's glexec tarball ticket. See the entry above - although I really need to update the ticket properly! Practice what you preach, Matt! On hold. | ||
+ | |||
+ | '''RHUL'''<br /> | ||
+ | [https://ggus.eu/?mode=ticket_info&ticket_id=119380 119380] (5/2)<br /> | ||
+ | ROD Low availability ticket - the site is in the green now, so it's the usual 30-day wait. On hold (8/2) | ||
+ | |||
+ | [https://ggus.eu/?mode=ticket_info&ticket_id=117881 117881] (23/11)<br /> | ||
+ | Atlas SCC ticket. On hold until March. On hold (1/2) | ||
+ | |||
+ | '''QMUL'''<br /> | ||
+ | [https://ggus.eu/?mode=ticket_info&ticket_id=117723 117723] (19/11)<br /> | ||
+ | Pilots at QM. Dan's been working on this, and asked Daniela for a picture of what should be enabled[1] - Any joy? In progress (27/1) | ||
+ | |||
+ | [1] http://www.hep.ph.ic.ac.uk/~dbauer/dirac/site_pilot_status.html | ||
+ | |||
+ | [https://ggus.eu/?mode=ticket_info&ticket_id=117880 117880] (23/11)<br /> | ||
+ | Atlas SCC ticket (wish I had started using that acronym sooner). Just waiting for the nod from atlas that all is well. Dan included the script he uses that may be useful for other STORM sites. Waiting for reply (4/2) | ||
+ | |||
+ | [https://ggus.eu/?mode=ticket_info&ticket_id=118985 118985] (21/1)<br /> | ||
+ | QM has banished biomed from their queues until QM have a cgroupy solution to the ill-behaved biomed user jobs. Biomed have asked that the ban be reconsidered and problem users be dealt with by the VO. QM are perfectly right to say no to this, but it'd be nice to not leave them hanging. On hold (1/2) | ||
+ | |||
+ | [https://ggus.eu/?mode=ticket_info&ticket_id=119348 119348] (4/2)<br /> | ||
+ | LHCB have noticed cvmfs issues on some nodes, which Dan couldn't replicate. Dan ponders that perhaps this is caused by ephemeral memory issues on the nodes, noting more swap being used recently. Waiting for reply (4/2) | ||
+ | |||
+ | [https://ggus.eu/?mode=ticket_info&ticket_id=119409 119409] (8/2)<br /> | ||
+ | Fresh ROD emi glexec ticket - things exploded at the weekend but the QM admins are fighting the good fight. In progress (8/2) | ||
+ | |||
+ | '''IMPERIAL'''<br /> | ||
+ | [https://ggus.eu/?mode=ticket_info&ticket_id=119294 119294] - but this got solved by the time I got to it (it concerned a java update breaking md5). | ||
+ | |||
+ | '''BRUNEL'''<br /> | ||
+ | [https://ggus.eu/?mode=ticket_info&ticket_id=117878 117878] (23/11)<br /> | ||
+ | Atlas SCC - Raul provided an example and is waiting on atlas to give a yay or nay before deploying. Waiting for reply (18/1) | ||
+ | |||
+ | [https://ggus.eu/?mode=ticket_info&ticket_id=118740 118740] (10/1)<br /> | ||
+ | Atlas MCORE problems at Brunel, which look to be caused by some extreme Condor oddness. Raul reconfigured Condor to give a better view. Any joy? In progress (25/1) | ||
+ | |||
+ | '''100IT'''<br /> | ||
+ | [https://ggus.eu/?mode=ticket_info&ticket_id=119002 119002] (Reopened)<br /> | ||
+ | [https://ggus.eu/?mode=ticket_info&ticket_id=116358 116358] (In Progress)<br /> | ||
+ | Not going into detail with these as I'm not sure what the crack is with 100IT. | ||
+ | |||
+ | '''AND FINALLY...''' | ||
+ | |||
+ | '''THE TIER 1'''<br /> | ||
+ | [https://ggus.eu/?mode=ticket_info&ticket_id=118809 118809] (12/1)<br /> | ||
+ | The Tier 1 provided feedback on configuring memory limits for batch jobs; the ticket has been left open for follow-up. On hold (13/1) | ||
+ | |||
+ | [https://ggus.eu/?mode=ticket_info&ticket_id=116864 116864] (12/10)<br /> | ||
+ | CMS AAA tests failing. Andrew L reports that the CASTOR headnode has received what sounds like a big fix which will hopefully improve things. In progress (29/1) | ||
+ | |||
+ | [https://ggus.eu/?mode=ticket_info&ticket_id=119389 119389] (5/2)<br /> | ||
+ | LHCB data transfer problem to RAL. Being looked at. In progress (5/2) | ||
+ | |||
+ | [https://ggus.eu/?mode=ticket_info&ticket_id=117683 117683] (18/11)<br /> | ||
+ | Another publishing ticket. How we love those! This one is about CASTOR not publishing GLUE 2. Code was written by Jens and Rob but not integrated, so something that works might be a long way off. That was a month ago, any news since? In progress (5/2) | ||
+ | |||
+ | [https://ggus.eu/?mode=ticket_info&ticket_id=109358 109358] (15/10) or (5/2)<br /> | ||
+ | This ticket is weird - it started in a "waiting for reply" state and was apparently issued in 2014! I can't find a ticket with this number in my records though. | ||
+ | Sno+ are unable to use the RAL WMS - it's being looked at. In progress (5/2) | ||
<!-- ******************Edit stop********************* -----> | <!-- ******************Edit stop********************* -----> |
Revision as of 15:51, 8 February 2016
Week commencing Monday 8th February 2016 |
Task Areas |
|
|
Meeting Summaries |
|
|