Monday 4th March 2013 14.45 GMT</br>
38 Open UK tickets today. All was going smoothly until the EMI1 tickets hit us; still, sites were swift to reply to them. It's the start of the month, so I need to take a break from Spring cleaning my desk (the horrors that I have seen) and take a look at all the tickets.
EMI 1 Tickets:</br>
(I won't go into much detail as they're likely to be talked about elsewhere and they only came out this morning.)
RALPP https://ggus.eu/ws/ticket_info.php?ticket=91997 (In progress) - Plan in place
OXFORD https://ggus.eu/ws/ticket_info.php?ticket=91996 (In Progress) - Is the deadline to upgrade the end of April, or do we need to be sorted before then?
BRISTOL https://ggus.eu/ws/ticket_info.php?ticket=91995 (In progress) - Winnie has asked for clarification of what's going on.
BIRMINGHAM https://ggus.eu/ws/ticket_info.php?ticket=91994 (In progress) - Mark will get onto this as soon as Birmingham's AC starts behaving.
GLASGOW https://ggus.eu/ws/ticket_info.php?ticket=91992 (In progress) - There are some red herrings at Glasgow due to hanging CE BDIIs. Just the WMSes and LB to go, and these are being handled.
SHEFFIELD https://ggus.eu/ws/ticket_info.php?ticket=91990 (In progress) - Elena plans to upgrade this month.
RHUL https://ggus.eu/ws/ticket_info.php?ticket=91987 (Assigned)</br>
https://ggus.eu/ws/ticket_info.php?ticket=91982 (Assigned)</br>
https://ggus.eu/ws/ticket_info.php?ticket=91981 (Assigned)</br>
(Poor RHUL getting 3 tickets - I assume this is the ROD dashboard being silly as Daniela mentioned)</br>
The real ticket: https://ggus.eu/ws/ticket_info.php?ticket=92111
LIVERPOOL https://ggus.eu/ws/ticket_info.php?ticket=91984 (In progress) - The Liver lads are working on it.
QMUL https://ggus.eu/ws/ticket_info.php?ticket=91980 (In Progress) - Chris has updated his BDII, so hopefully things will be sorted.
IC https://ggus.eu/ws/ticket_info.php?ticket=91978 (In Progress) - WMS updated, last CE has a scheduled downtime, um, scheduled.
BRUNEL https://ggus.eu/ws/ticket_info.php?ticket=91975 (In Progress) - Raul plans to upgrade things at the end of the month. He asks about the dangers of upgrading the CE from EMI1 to EMI2 - Daniela replies that, because of the DB change, it's recommended to drain your CE first.
TIER 1 https://ggus.eu/ws/ticket_info.php?ticket=91974 (In Progress) - The team plan to have all services updated by the end of March.
Atlas data moving tickets:</br>
https://ggus.eu/ws/ticket_info.php?ticket=90242 (Lancaster)</br>
https://ggus.eu/ws/ticket_info.php?ticket=90243 (Liverpool)</br>
https://ggus.eu/ws/ticket_info.php?ticket=90244 (RALPP)</br>
https://ggus.eu/ws/ticket_info.php?ticket=90245 (Oxford)</br>
https://ggus.eu/ws/ticket_info.php?ticket=89804 (Glasgow)</br>
Nearing the end of these. Lancaster and Oxford are down to their last few files (which might need to be manually fixed at the site end; the one left at Lancaster is lost for good). RALPP similarly have dark data files that might need to be cleaned up locally. Liverpool are waiting on atlas after giving them a new list of files. Glasgow have been asked for a fresh file dump.
The Rest:
TIER 1</br>
https://ggus.eu/ws/ticket_info.php?ticket=91687 (21/2)</br>
Support for the epic VO on the RAL WMS. Request for pool accounts went out but no word since. In progress (21/2)
https://ggus.eu/ws/ticket_info.php?ticket=91658 (20/2)</br>
Request from Chris W for webdav redirection support on the RAL LFC. As reported last week, this is waiting on the next release, which has better, stronger, faster webdav support in it. In Progress (22/2)
https://ggus.eu/ws/ticket_info.php?ticket=91146 (4/2)</br>
atlas tracking RAL bandwidth issues. The ticket was waiting on last week's downtime to hopefully sort things out. Did the picture improve? In progress (12/2)
https://ggus.eu/ws/ticket_info.php?ticket=91029 (30/1)</br>
Again from atlas, this is the ticket about FTS queries failing for some jobs involving users with odd characters in their names. Either a fix needs to be implemented by the srm developers or atlas need to work around it by changing their robot DNs. On hold (27/2)
https://ggus.eu/ws/ticket_info.php?ticket=90528 (17/1)</br>
Sno+ jobs weren't making their way to Sheffield, which was tracked to a problem with one WMS. As the cause of the problem is unknown and completely unobvious, it was suggested that Sno+ jobs be restricted to the working WMS, but there's still no reply from Sno+. Waiting for reply (19/2)
https://ggus.eu/ws/ticket_info.php?ticket=86152 (17/9/2012)</br>
Correlated packet loss on the RAL Perfsonar host. Did last week's network intervention fix things? Or maybe the problem evaporated (I'm ever the optimist)? On hold (16/1)
IMPERIAL</br>
https://ggus.eu/ws/ticket_info.php?ticket=91866 (28/2)</br>
It looks like atlas jobs were running afoul of some cvmfs problems on some nodes. They've been given a kick; it's worth seeing if the problem has gone away. In progress (28/2)
GLASGOW</br>
https://ggus.eu/ws/ticket_info.php?ticket=91792 (26/2)</br>
Atlas thought that they had lost some files, but it turns out that they just had bad permissions on a pool node (root.root) - the problem's been fixed and Sam is investigating with his DPM hat on, whilst checking the filesystems for more possible bad files. In progress (4/3)
https://ggus.eu/ws/ticket_info.php?ticket=90362 (13/1)</br>
All Glasgow's CEs have been switched over to use the GridPP voms server for ngs.ac.uk; they just need some testing. Solved (4/3).
SHEFFIELD
https://ggus.eu/ws/ticket_info.php?ticket=91770 (25/2)</br>
lhcb complaining about the default value being published for Max CPU time. No news from Sheffield beyond the acknowledgement of the ticket. In Progress (25/2)
DURHAM</br>
https://ggus.eu/ws/ticket_info.php?ticket=91745 (24/2)</br>
enmr.eu having trouble with lcg-tagging things at Durham. Mike gave this a kick, and asked if the problem has gone away. Waiting for reply (25/2)
RHUL</br>
https://ggus.eu/ws/ticket_info.php?ticket=91711 (21/2)</br>
atlas having trouble copying files into RHUL. It's being looked at but PRODDISK and ANALY_RHUL have been put offline. In Progress (28/2)
https://ggus.eu/ws/ticket_info.php?ticket=89751 (17/12/2012)</br>
Path MTU discovery problems to RHUL. On hold since being handed over to the Network guys, who were following it up with Janet. On hold (28/1)
LANCASTER</br>
https://ggus.eu/ws/ticket_info.php?ticket=91304 (8/2)</br>
LHCB having trouble on one of Lancaster's clusters as they like to run their jobs in the home directory rather than $TMPDIR. Forcing this behaviour is harder than it should be in LSF, so it looks like we're going to have to relocate the lhcb home directories. In Progress (1/3)
https://ggus.eu/ws/ticket_info.php?ticket=90395 (14/1)</br>
dteam jobs failed at Lancaster, due to our old CE being rubbish. It's since been reborn with new disks, but embarrassingly I haven't found the time to set a UI up for dteam and test it myself (which I intend to do as part of testing the UI tarball, but that's a whole other story). In progress (18/2)
ECDF</br>
https://ggus.eu/ws/ticket_info.php?ticket=90878 (27/1)</br>
lhcb were having problems with cvmfs at Edinburgh, but the fixes attempted can't be checked due to dirac problems at the site. In progress (could be knocked back to waiting for reply) (28/2)
BRISTOL</br>
https://ggus.eu/ws/ticket_info.php?ticket=90328 (11/1)</br>
The Bristol SE is publishing some odd values - zero used space. Waiting on another, similar ticket (90325) to be resolved. On hold (11/2)
https://ggus.eu/ws/ticket_info.php?ticket=90275 (10/1)</br>
The CVMFS taskforce have asked for Bristol's CVMFS plans. One Bristol CE has been migrated to using it, with one left to go. On hold (5/2)
EFDA-JET</br>
https://ggus.eu/ws/ticket_info.php?ticket=88227 (6/11/2012)</br>
Jet have exhausted all options trying to fix this biomed job publishing problem. They're looking at reinstalling the CE to fix it, which seems like using a sledgehammer to crack a walnut (but I don't have any better ideas). On hold (25/2). Daniela suggests assigning the issue to the developers.