Monday 4th August 2014, 14.30 BST
20 Open UK tickets this week.
NGI/Other
107369(30/7)
NGIs are being asked to ask Cloud sites to fill in a questionnaire about the security of grid-deployed cloud stuff. This ticket was meant for 100IT, although shouldn't there be one for UKI-GridPP-Cloud-IC too? (I couldn't see such a ticket in the solved pile). I've assigned it to uk-ngi ops and notified the site of the ticket. Assigned (4/8)
106615(2/7)
Decommissioning ticket for the FTS2 service at RAL on the 2/9/14. Nothing else to do really, on hold until closer to the time. On hold (14/7)
BRISTOL
106325(18/6)
CMS pilots losing their network connections. The Bristol admins are waiting to see how things pan out for a similar RAL ticket (106324), but I'm not sure if waiting for this is the right thing to do: it could be that the RAL problems are very RAL-specific. On hold (14/7)
106554(29/6)
Another CMS ticket, about FTS backlogs between FNAL and Bristol. Although the original transfer has finished, a connectivity problem still seems to persist, and Lukasz has offered some suggestions and asked CMS how they'd like to proceed. Waiting for reply (29/7)
Could these two issues be somehow related?
GLASGOW
107435(1/8)
CMS glideins were getting held up at Glasgow. Dave and the gang tracked down missing /cms/Role=pilot explicit mappings in their argus, and have added them in. Things are looking better, with the ticket now in the customary "How's it look on your end?" state. Waiting for reply (4/8)
ECDF
95303(1/7/13)
glexec deployment ticket. There's been some movement on the glexec tarball development front at last. Jeremy tweaked the reminder date and assigned person. On hold (29/7)
SHEFFIELD
107217(24/7)
Sheffield failing site-bdii checks due to the old "all the 4s" published waiting jobs problem (usually due to broken dynamic publishing). Ticket's acknowledged, and the expiry has been extended, but could do with a proper update soon. In progress (1/8)
LANCASTER
100566(27/1)
Poor perfsonar performance. After a reinstall of the node and establishing that there is no hard bottleneck we're stuck. Currently waiting on some network engineer time at Lancaster, whilst scratching our heads over why the perfsonar isn't working right for us. On hold (4/8)
95299(1/7/13)
Tarball glexec ticket. I've opened up a line with the glexec devs, who have been very helpful. They've given me some build tips, but then I went on holiday before I could use their advice. On hold (29/7)
UCL
101285(16/2)
Ben got the UCL perfsonar box back on its NICs, and is just waiting on getting it back into the WLCG mesh. Jeremy is on the case. Waiting for reply (29/7)
95298(1/7/13)
UCL's glexec ticket. Ben has stated this will have to wait until he is back from leave at the end of August. On hold (29/7)
RHUL
107436(1/8)
ATLAS having transfer problems to RHUL. Govind has tracked down a gridftp mapping problem that accounts for some of the errors; the rest seem to be due to his new pool nodes misbehaving. Perhaps a problem with the configuration of the latest version of DPM? In progress (3/8)
QMUL
107440(2/8)
LHCb seem to be having problems getting files from the input sandbox on what appears to be all QM CREAM CEs. Chris is on the case. In progress (4/8)
107402(31/7)
Site BDII test failures. It looks like this problem has evaporated; Gareth has suggested that somebody at QM close the ticket if they're happy. Assigned (can be closed) (1/8)
Cloud-IC
106347(19/6)
The new cloud site was noted as hogging 12% of the CERN stratum one CVMFS connections. There was some discussion about this, and the Shoal installation at Oxford might well prevent this from happening again, but the site was down for maintenance so no confirmation could be made. When is the cloud site likely to be back in action? On Hold (14/7)
EFDA-JET
97485(21/9/13)
LHCb authentication errors at JET, which survived OS and EMI upgrades. I've been all talk and no trousers about getting round to helping JET out. Has anyone seen anything like this? Or have any ideas? On Hold (29/7)
TIER 1
107416 (31/7)
The RAL FTS has been accused of hammering the US MWT2 srm. Andrew has suggested a course of action that might soothe things. Waiting for Reply (4/8)
106655 (4/7)
Castor failing ops tests. This was due to reasons that are understood by clever people. A fix was delayed, but hopefully should roll out this week. In progress (31/7)
105405 (14/5)
Vidyo router firewall checking ticket. This ticket has been left fallow for a while, with some offline discussion between the Vidyo devs and RAL networking. Any news? On hold (1/7)
106324(18/6)
CMS pilots losing connection to their submission hosts. Firewall tweaks haven't fixed the problem, but a suggestion of changing the pilot "keepalive" parameter was put forward. There seems to be some confusion on the CMS end about the current state of this issue, but last word says it persists. In progress (30/7)