Difference between revisions of "Past Ticket Bulletins 2016"


Revision as of 13:36, 15 February 2016

Monday 8th February 2016, 13.30 GMT
43 Open UK Tickets this month (previously listed as 44). Going over all of them, in kinda-alphabetical order.

NGI
118930 (18/1)
That NGI information ticket, linked to the "wrong" (according to some) information being published by the UK arc CEs. This has haunted us for a while; the consensus was that the ticket is a load of B-word and not really worth worrying over - but it does warrant a response (from someone other than Steve J). Assigned (19/1)

SUSSEX
With Matt RB off to pastures green Sussex is in limbo - I'll contact Jeremy M concerning last week's fresh tickets.

117894 (23/11)
Atlas Consistency Checking. On hold (25/1)

118289 (10/12)
Gridpp Pilots. On hold (25/1)

118337 (14/12)
The Sussex SE was not working for Sno+ - the most serious of these older issues. On hold (25/1)

119383 (5/2)
ROD Availability ticket. Assigned (5/2)

119384 (5/2)
ROD CA distribution ticket. Maybe the two ROD tickets are correlated (i.e. if we fix this one the previous one will soothe itself?) Assigned (5/2)

RALPP
118945 (19/1)
Poor CMS SAM results for RALPP due to digi-reco work pummeling the RALPP storage - Chris asked for the digi-reco workload to stop at RALPP, then asked for clarification as to why the site was still in unknown state. Waiting for reply (25/1) Solved - it was them, not RALPP - a restart of the SAM services looks to have cleared the issue.

118628 (5/1)
LZ Pilot deployment at RALPP. Chris has submitted a bug report to nordugrid to fix the issue (http://bugzilla.nordugrid.org/show_bug.cgi?id=3529), which was fixed and should be available in the next release. On Hold (26/1) Update - Chris is trying to get hold of a pre-release to test things.

OXFORD
119197 (29/1)
CMS has asked to change some CRAB site configs at T3s - Daniela has asked Chris B if he's the one looking after this for Oxford. Assigned (3/2)

117892 (23/11)
Atlas consistency checks. Ewan has firmly and clearly put this on the backburner. On hold (12/1)

BIRMINGHAM
118155 (4/12)
Biomed having a clear up of their stuff on the Brummie SE. Franck has given the nod for deleting the dark data left in the DPM after their cleanup efforts. It's on their heads now! In progress (2/2)

117890 (23/11)
Another Atlas Storage Consistency Checking ticket. Any chance to have a look at this again? On hold (15/12)

GLASGOW
117706 (19/11)
Another pilot ticket, this time for pheno. Glasgow were going to roll this into their overhaul of their identity management gubbins, but the Universe messed with their plans. How goes things? On hold (15/1)

118052 (30/11)
HTTP support on the Glasgow SE. I suspect progress here took a similar shoeing to the identity management plan - but the ticket could do with an update (and maybe on-holding). In Progress (4/1)

ECDF
118787 (12/1)
Another HTTP ticket. Let us know if you need a hand Marcus and Andy. Or if you're too busy to make this a priority consider on-holding it. In progress (12/1)

95303 (1/7)
Tarball glexec ticket. On hold for a very long time.

An update on this - I managed to put in some good hours on trying to build a relocatable glexec last week, successfully building from source glexec and the lcas/lcmaps stack. *But* I still have rpath problems - short of attacking every lib file with patchelf I'm not sure how to proceed, and the process is such a mess that I'm not sure if I'll ever manage to make it into a proper recipe (much like my cocoa-butter shortbread).
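For anyone fighting the same rpath battle, here is a minimal sketch of how the per-file patchelf invocations could be generated so that each binary finds its libraries relative to its own location. The paths are hypothetical, the helper names are made up for illustration, and this is not the recipe used for the tarball; `--set-rpath` is a real patchelf option. Note that the dynamic loader may restrict `$ORIGIN` for setuid binaries, so this alone might not be enough for glexec itself.

```python
import posixpath


def origin_rpath(binary_path, lib_dir):
    """Return an $ORIGIN-relative rpath from the binary's directory to lib_dir."""
    rel = posixpath.relpath(lib_dir, start=posixpath.dirname(binary_path))
    return "$ORIGIN" if rel == "." else "$ORIGIN/" + rel


def patchelf_command(binary_path, lib_dir):
    """Build (but do not run) the patchelf call that would rewrite the rpath."""
    return ["patchelf", "--set-rpath", origin_rpath(binary_path, lib_dir), binary_path]


if __name__ == "__main__":
    # Hypothetical tarball layout, purely for illustration.
    print(" ".join(patchelf_command("/opt/tarball/sbin/glexec", "/opt/tarball/lib64")))
```

The commands are only printed, so the list can be reviewed before anything is modified.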

SHEFFIELD
119374 (5/2)
A fresh ticket from Biomed, about incorrect/no dynamic information being published at Sheffield. In progress (5/2) Update - see Steve B's post to TB-SUPPORT for clues, Elena is retackling these problems today.

118789 (12/1)
ROD Information system ticket, almost certainly caused by the same underlying issue. Is the bdii service on your CEs silently dying or failing to update?

114460 (18/6)
Gridpp Pilots. Changes were implemented but at last check things weren't working right. How goes it now? In progress (20/1)

117886 (23/11)
Atlas Storage Consistency Check ticket - any luck with this? On hold (29/1)

118764 (12/1)
HTTP support ticket for the Sheffield SE. Have you had a chance to have a look at this? In progress (25/1)

The Storage list can lend a hand fixing either of these issues (which goes for everyone of course).

MANCHESTER
118679 (7/1)
HTTP support (atlas edition). Hit a problem due to there being no outside-a-space-token space at Manchester. On Hold (12/1)

118674 (7/1)
HTTP Support (lhcb edition). As above. On Hold (12/1)

117885 (23/11)
Atlas Storage Consistency Checks - hit the same problem as the previous 2 tickets. On hold (10/1)

118603 (4/1)
A VOMS ticket rather than a site ticket, removal of the nsccs.ac.uk VO. The VO has been removed from the other UK voms servers. In progress (5/2) Update - solved

LANCASTER
95299 (1/7)
Lancaster's glexec tarball ticket. See the entry above - although I really need to update the ticket properly! Practice what you preach, Matt! On hold.

RHUL
119380 (5/2)
ROD Low availability ticket - the site is in the green now, so it's the usual 30-day wait. On hold (8/2)

117881 (23/11)
Atlas SCC ticket. On hold until March. On hold (1/2)

QMUL
117723 (19/11)
Pilots at QM. Dan's been working on this, and asked Daniela for a picture of what should be enabled[1] - Any joy? In progress (27/1)

[1] http://www.hep.ph.ic.ac.uk/~dbauer/dirac/site_pilot_status.html

117880 (23/11)
Atlas SCC ticket (wish I had started using that acronym sooner). Just waiting for the nod from atlas that all is well. Dan included the script he uses that may be useful for other STORM sites. Waiting for reply (4/2)

118985 (21/1)
QM has banished biomed from their queues until QM have a cgroupy solution to the ill-behaved biomed user jobs. Biomed have asked that the ban be reconsidered and problem users be dealt with by the VO. QM are perfectly right to say no to this, but it'll be nice to not leave them hanging. On hold (1/2)

119348 (4/2)
LHCB have noticed cvmfs issues on some nodes, which Dan couldn't replicate. Dan ponders that perhaps this is caused by ephemeral memory issues on the nodes, noting more swap being used recently. Waiting for reply (4/2)

119409 (8/2)
Fresh ROD emi glexec ticket - things exploded at the weekend but the QM admins are fighting the good fight. In progress (8/2)

IMPERIAL
119294 - but this got solved by the time I got to it (it concerned a java update breaking md5).

BRUNEL
117878 (23/11)
Atlas SCC - Raul provided an example and is waiting on atlas to give a yay or nay before deploying. Waiting for reply (18/1)

118740 (10/1)
Atlas MCORE problems at Brunel, which look to be caused by some extreme Condor oddness. Raul reconfigured Condor to give a better view. Any joy? In progress (25/1)

100IT
119002 (Reopened)
116358 (In Progress)
Not going into detail with these as I'm not sure what the crack is with 100IT.

AND FINALLY...

THE TIER 1
118809 (12/1)
The Tier 1 provided feedback on configuring memory limits for batch jobs, the ticket left open for follow up. On hold (13/1)

116864 (12/10)
CMS AAA tests failing. Andrew L reports that the CASTOR headnode has received what sounds like a big fix which will hopefully improve things. In progress (29/1)

119389 (5/2)
LHCB data transfer problem to RAL. Being looked at. In progress (5/2)

117683 (18/11)
Another publishing ticket. How we love those! This one about CASTOR not publishing GLUE 2. Code was written by Jens and Rob but not integrated, something that works might be a long way off. That was a month ago, any news since? In progress (5/2)

109358 (15/10) or (5/2)
This ticket is weird - it started in a "waiting for reply" state and was apparently issued in 2014! I can't find a ticket with this number in my records though. Sno+ are unable to use the RAL WMS - it's being looked at. In progress (5/2)


Monday 1st February 2016, 14.30 GMT
50 Open UK Tickets this week, no Ops meeting scheduled so postponing a full review.

org.bdii.GLUE2-Validate tickets
We have 8 sites with these tickets (7 as Bristol have slain theirs), these are being discussed on TB-SUPPORT. A lot of these are still just assigned though - even if the issue is not really our fault we still need to handle the ticket proper. Rising above it all and all that.

If someone has submitted or knows of a counter-ticket for this issue please let me know.

NGI
Talking about a pain in the Information System, the UK still has this ticket to close (which has a similar root problem): 118930

CMS Siteconf problems.
GLASGOW 119196
EDINBURGH 119195
OXFORD 119197

CMS have spotted a number of misconfigured T3s across the globe (on a Friday afternoon)- the fix seems to be straightforward enough and Glasgow look like they're done already. Proper job!

ATLAS CONSISTENCY CHECKS
We still have 8 tickets open on this issue, although a couple are waiting for feedback from atlas. I'll bring this up in the Thursday UK atlas meeting to see if we can't shimmy along the tickets waiting for atlas feedback.

PILOTS
117723
Whilst investigating pilot issues at QM Daniela reminds us of this page that tells us what Dirac things should be going on at your site. Might be handy to preempt problems:
http://www.hep.ph.ic.ac.uk/~dbauer/dirac/site_pilot_status.html

118628
Whilst rolling out similar changes for LZ at RALPP Chris stumbled upon a problem, for which he submitted a bug report to nordugrid: http://bugzilla.nordugrid.org/show_bug.cgi?id=3529

AND FINALLY

QMUL
118985 (21/1)
Biomed have got back to Dan asking whether, rather than banning them altogether until he has a cgroup-corral to put their jobs in, he would be willing and able to supply a list of the problem users. Of course this requires that there be any non-problem users in the VO... On hold (1/2)

Monday 25th January 2016, 14.30 GMT

"OTHER VO" NAGIOS
Looks like hepgrid2.ph.liv.ac.uk at Liverpool is playing up for all VOs, and the Sheffield SE is misbehaving for the gridpp VO. Other than that it looks clean.

43 Open UK Tickets this week.

That ticket to the NGI...
118930 (18/1)
Steve J put in a comprehensive reply about what Liverpool do to get their publishing kinda right. The view on this ticket from last week was to close it with a <carefully|harshly> worded statement about why this is a bit of a pointless request. Who was formulating the reply? If it was me I dropped that ball! Assigned (19/1)

Pilots Problems.
BRUNEL: 117710 Pheno. On Hold (19/11/15)
QMUL: 117723 Pheno - hopefully sorted. Waiting for reply (25/1)
SHEFFIELD: 114460 gridpp et al. In Progress (20/1)
RALPP: 118628 LZ (and maybe LSST?). In progress (14/1)

We have a few pilot rollout tickets, the last two being worked on but proving problematic.

RHUL
119027 (22/1)
As seen on the gridpp-storage list, Sno+ have asked RHUL (and will no doubt ask others) for storage space (~20TB). In progress (22/1)

(for the interest of others, Govind's other thread on gridpp-storage was likely triggered by https://ggus.eu/?mode=ticket_info&ticket_id=118553)
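Since most tickets in these bulletins are quoted by bare number, a small helper can turn a number into a link of the same form as the one above. This is only a sketch: the function name is made up, and the URL pattern is simply copied from that link rather than taken from any GGUS documentation.

```python
from urllib.parse import urlencode

# Base URL as it appears in the ticket link quoted in the bulletin.
GGUS_BASE = "https://ggus.eu/"


def ggus_ticket_url(ticket_id):
    """Return the ticket-info URL for a GGUS ticket number."""
    query = urlencode({"mode": "ticket_info", "ticket_id": int(ticket_id)})
    return GGUS_BASE + "?" + query


if __name__ == "__main__":
    for number in (118553, 118985):
        print(ggus_ticket_url(number))
```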

QMUL
118985 (21/1)
QM have banished biomed from their cluster until they have a batch system that can put Biomed jobs in a c-group cage (looking at slurm). On Hold (21/1)

BIRMINGHAM
118155 (4/12)
Talking of Biomed, they've asked if they've successfully cleaned up all their files on the Birmingham SE - a cheeky uberftp onto your SE suggests the biomed directory is still full of cra.. I mean, files. In Progress (20/1)

HTTP TF Tickets
118787 (ECDF)
118764 (SHEFFIELD)
Feel free to poke the gridpp storage group for help with these. (I left out the 2 Manchester tickets as their immediate showstopper isn't their configs- but they can ask for help too!).

ATLAS CONSISTENCY CHECKS
Manchester, Oxford, Birmingham, Sussex, RHUL, Sheffield, Brunel and QMUL still open - a mix of chugging along nicely and being very much "On Hold".

Monday 18th January 2016, 14.00 GMT
49(!!) Open UK Tickets this week

NGI
118930 (18/1)
The NGI received a ticket concerning incorrect or missing glue information for the Tier 1, Brunel, Imperial, Liverpool, Durham, Glasgow, Bristol, Oxford and RALPP. The variables in question are GlueSubClusterPhysicalCPUs, GlueSubClusterLogicalCPUs and GlueHostProcessorOtherDescription. There are some extra instructions in the ticket - it would be nice if we didn't have to create child tickets (hint hint...).
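As a rough way to spot-check what a CE publishes before replying, the three attributes named in the ticket can be tested for presence and plausibility. This is a sketch under assumptions: the record is a plain dict of attribute names to values (however you choose to extract it from your information system), and the rule that the two CPU counts must be positive integers is a guess at what "incorrect" means here, not the ticket's own definition.

```python
# The three attributes named in the NGI ticket.
REQUIRED = (
    "GlueSubClusterPhysicalCPUs",
    "GlueSubClusterLogicalCPUs",
    "GlueHostProcessorOtherDescription",
)


def find_glue_problems(record):
    """Return a list of human-readable problems found in one published record."""
    problems = []
    for attr in REQUIRED:
        value = str(record.get(attr, "")).strip()
        if not value:
            problems.append(f"{attr}: missing")
        elif attr.endswith("CPUs") and (not value.isdigit() or int(value) == 0):
            # Assumed rule: CPU counts should be positive integers.
            problems.append(f"{attr}: not a positive integer ({value!r})")
    return problems


if __name__ == "__main__":
    # Hypothetical record, purely for illustration.
    sample = {"GlueSubClusterPhysicalCPUs": "0", "GlueSubClusterLogicalCPUs": "64"}
    for line in find_glue_problems(sample):
        print(line)
```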

ATLAS CONSISTENCY CHECKS (10 tickets)
Progress, or at least non-exciting but reassuring updates, on these. Birmingham and Glasgow tickets could do with an update (even if it's a "nothing to see here").

The QMUL ticket had an update providing feedback that might be useful to others too:
https://ggus.eu/?mode=ticket_info&ticket_id=117880

HTTP TF (5 tickets)
ECDF, Manchester, Sheffield and Glasgow are on the HTTP TF list - although no tickets are stale at the moment.

TIER 1 RECOMMENDATIONS
118809 (12/1) An interesting ticket asking T0 and T1s to fill in a questionnaire on configuring batch job memory limits - the Tier 1 have done their bit and the ticket is On Holded for feedback.

GLASGOW
118732 (9/1)
This ticket has got confusing - atlas want a dump for files "lost" at Glasgow that by the looks of it actually never made it to the site in the first place... Waiting for reply (15/1)

TIER 1 DUPLICATES
Are these three CMS tickets the same (or similar or related) issue - or am I just getting my wires crossed?
118494 (23/12/15)
116864 (12/10/15)
118722 (8/1)

CAN BE CLOSED (I THINK)
IC - 118162 (lfc ticket)
QM - 118839 (atlas mcore job failures - doesn't look like the problem persists).

NEARLY THERE:
Lancaster - 118637 (squid misconfiguration hammering stratum-0)
Birmingham - 118155 (biomed SE use - biomed now think they deleted all data at Birmingham).

Monday 11th January 2016, 14.30 GMT
48(!) Open UK Tickets this week

  • VOMS TWEAK

118603: nsccs.ac.uk has been requested to be removed from the gridpp voms servers. Just "Assigned" to the UK as a whole at the moment.

  • THE HTTP TASK FORCE STRIKES

Lancaster, RHUL and Manchester all had http TF tickets alongside Glasgow. Your site might be next! It'll be worth checking the monitoring pages and reviewing the documentation if you are:
atlas: http://cern.ch/go/h8Rr
lhcb: http://cern.ch/go/Bk8J
documentation: https://twiki.cern.ch/twiki/bin/view/LCG/HTTPTFSAMProbe

  • TRANSFER ODDITIES

118494: The Tier-1 have a CMS ticket where xrootd is expecting a file which phedex and DAS don't think is at RAL. Is this even a site problem?

118728: In a similar vein, QMUL have an atlas ticket where a single file is refusing to be transferred - Dan has noticed a number of write attempts followed by immediate deletion. Checksumming causing a problem?
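If checksumming is the suspect, one way to rule it in or out is to compute the checksum of a local copy and compare it by hand with the value the catalogue expects. The sketch below assumes the checksum in question is adler32 (not confirmed by the ticket) and that the file can be read from a local path; the function name is illustrative.

```python
import zlib


def adler32_of_file(path, chunk_size=1024 * 1024):
    """Return the adler32 checksum of a file as 8 lowercase hex digits."""
    value = 1  # adler32 starting value
    with open(path, "rb") as handle:
        # Read in chunks so large files do not need to fit in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            value = zlib.adler32(chunk, value)
    return f"{value & 0xFFFFFFFF:08x}"


if __name__ == "__main__":
    import sys

    for name in sys.argv[1:]:
        print(adler32_of_file(name), name)
```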

  • LOW HANGING FRUIT- tickets that can probably be closed, or are close to it.

IMPERIAL 118162
A ticket for the Imperial LFC, which appeared to be working (for Janusz at least).

RALPP 117740
Atlas datadisk cleanup ticket. Elena confirmed that the step09 directory can go for the chop. Not sure if Brian has had a chance at looking at the users directory contents yet.

BRISTOL 118311
I suspect that this CMS SAM ticket can be closed as the CEs were all green.

  • ATLAS CONSISTENCY CHECKS

As requested at the Thursday atlas meeting here's the outstanding consistency check tickets.

IMPERIAL: 117879
Not much news, (understandably) low priority for the site.

SUSSEX: 117894
It doesn't look like Matt got round to this before he left.

SHEFFIELD: 117886
Set in progress but no news since.

OXFORD: 117892
A similar case here - I assume it's on Ewan's to-do list before he heads off to pastures green.

BIRMINGHAM: 117890
Matt was going to look at this again in the New Year. Any joy?

RHUL: 117881
Govind was going to try to get to this before Christmas. Any luck?

GLASGOW: 117889
Back in 2015 the dumps were run and Sam asked for some clarification. Considering Glasgow's current state any dump made using these tools might be full of lies, but I know that you chaps are working on this problem.

BRUNEL 117878
Raul asked some questions in his ticket, to which atlas only replied last week.

QMUL: 117880
Dan has created dumps and has asked for the all clear before he sets up the monthly cron.

TIER 1: 117846
Dumps have been created, but gfal and castor issues have slowed down the checking process (gfal-cat doesn't seem to work with castor).

MANCHESTER: 117885
This ticket was recently On-Holded, as currently Manchester has 0 free space outside of tokens whilst a few disk servers are down.
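For sites that have a dump but are unsure what to do with it, the core of a consistency check is just a comparison of two path lists. The following is a minimal sketch and not the atlas tooling: it assumes one path per line in both the storage dump and the catalogue listing, and it calls files present only on disk "dark" and files present only in the catalogue "lost".

```python
def compare_dumps(storage_paths, catalogue_paths):
    """Compare a storage dump with a catalogue listing.

    Both arguments are iterables of path strings (for example open file
    handles). Blank lines and surrounding whitespace are ignored.
    """
    on_disk = {p.strip() for p in storage_paths if p.strip()}
    in_catalogue = {p.strip() for p in catalogue_paths if p.strip()}
    return {
        "dark": sorted(on_disk - in_catalogue),  # on the SE, unknown to the catalogue
        "lost": sorted(in_catalogue - on_disk),  # in the catalogue, missing from the SE
    }


if __name__ == "__main__":
    # Hypothetical paths, purely for illustration.
    dump = ["/atlas/datadisk/a.root\n", "/atlas/datadisk/b.root\n"]
    catalogue = ["/atlas/datadisk/b.root\n", "/atlas/datadisk/c.root\n"]
    print(compare_dumps(dump, catalogue))
```

As the Glasgow entry notes, the result is only as trustworthy as the dump itself, so it is worth checking when the dump was taken before acting on the "dark" list.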

Monday 4th January 2016, 14.30 GMT
HAPPY NEW YEAR EVERYONE!

38 Open UK Tickets this year.

All-the-UK-tickets URL: http://tinyurl.com/nwgrnys

As Jeremy spotted, with Matt RB off to pastures new the Sussex tickets are looking a bit neglected, especially as one was reopened after his departure:
118337
118289

Finally in this Glasgow ticket the submitter gave two new links for the http taskforce monitoring: 118052

The links to the http tf monitoring pages are:
atlas: http://cern.ch/go/h8Rr
lhcb: http://cern.ch/go/Bk8J