Monday 20th October 2014, 14.30 BST
Non-LHC VO Nagios Failures:
VO Nagios
Liverpool, Lancaster (we're getting better), Sheffield, EFDA-JET, The Tier 1, Bristol (in downtime) and Cambridge are on "the list". Most are transient, load-based errors. gridpp, pheno and southgrid seem to be the VOs having the most problems.
We're up to 30 Open UK Tickets this week.
TIER 1
109276(11/10)
Submissions to the FTS3 REST interface were failing for some users, probably after the certs or CRLs went stale. Andrew L suggested implementing an httpd restart, which Maarten suggested was overkill - but anyhoo, the submitter has come back to say that he hasn't seen a problem all week, so this ticket can likely be closed. In progress (20/10)
108845(27/9)
Just a heads up that this ATLAS transfer failure ticket has been reopened. Reopened (18/10)
RALPP
109360(15/10)
This SNO+ ticket, about failing Nagios tests at RALPP, hasn't been noticed yet. Assigned (15/10) Update - it was a ticket meant for the Tier 1 all along. In progress now - actually, waiting for reply
SHEFFIELD
109207(8/10)
SNO+ would like the VO_SW_DIR environment variable to point to cvmfs - I know Elena has looked at this, any progress? In progress (9/10)
Similar story with another SNO+ ticket at Sheffield:
109223(9/10)
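For sites wanting a quick sanity check on this sort of request, here is a minimal sketch in Python that tests whether a VO software directory variable resolves to somewhere under /cvmfs. The function name sw_dir_on_cvmfs is made up for illustration, and the variable name VO_SNOPLUS_SNOLAB_CA_SW_DIR is an assumption based on the usual VO_<NAME>_SW_DIR convention, not something taken from the tickets; check what your site actually exports.

```python
import os
import posixpath


def sw_dir_on_cvmfs(var_name, environ=None):
    """Return True if the named variable is set and points under /cvmfs.

    var_name is the environment variable to inspect; environ defaults to
    the real process environment but can be any mapping (handy for tests).
    """
    env = os.environ if environ is None else environ
    value = env.get(var_name)
    if not value:
        # Unset or empty: definitely not pointing at cvmfs.
        return False
    # Normalise so that things like "/cvmfs/../opt" are not mistaken for cvmfs.
    normalized = posixpath.normpath(value)
    return normalized == "/cvmfs" or normalized.startswith("/cvmfs/")


if __name__ == "__main__":
    # Assumed variable name, following the VO_<NAME>_SW_DIR pattern.
    name = "VO_SNOPLUS_SNOLAB_CA_SW_DIR"
    print(name, "on cvmfs:", sw_dir_on_cvmfs(name))
```

Run on a worker node (or inside a test job), this only reports where the variable points; it doesn't check that the repository is actually mounted.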
BRUNEL
109379(16/10)
SRM Nagios test failures. It looks like Brunel's SE is in a dodgy state - too many ftp connection failures have been seen in the gridftp logs, httpd is causing heavy load, and there are possible SELinux problems after the DB move. I'm sure any input on this would be appreciated. In progress (17/10)
IMPERIAL/DIRAC
108723(23/9)
I think this ticket from Chris W, containing questions for the DIRAC team, can be closed in favour of the new line of communication Daniela set up (https://mailman.ic.ac.uk/mailman/listinfo/gridpp-dirac-users). Waiting for reply (7/10) Update - closed.
ECDF AND GLASGOW
Two very similar LHCb cvmfs tickets at these sites, any chance of a link? Or perhaps just a coincidence?
ECDF: 109440
GLASGOW: 109439
Update - probably not, the Edinburgh ticket is now closed.
Another Update
I think that the SnoPlus ticket asking for srmcp help for SUSE users (107880) can be closed now, thanks to Duncan's tip for getting srmcp to work (num_streams=1).
Another another Update
The 100IT ticket (108356), about making fedcloud.egi.eu available at the site, has been updated by David B after some silence - currently waiting for a reply. Perhaps a sign that the process needs better documentation, in the event any sites go down this rabbit hole?