Monday 3rd December 2018, 14.00 GMT
39 Open Tickets this month, going site by site:
RALPP
138588 (28/11)
A CMS ticket about SRM timeouts, Ian has marked it in progress but no update with words in it yet. In progress (29/11)
131616 (3/11/17)
RALPP's v6 ticket. After having to roll back the last attempt, Chris is having another go today after a router firmware update. Good luck! In Progress (3/12)
OXFORD
131615 (3/11/17)
Just the v6 ticket at Oxford, no recent news but Duncan asked some questions last week, advertising some JISC services that could make life easier. On Hold (29/11)
BRISTOL
138402 (21/11)
An LHCB ticket for failed pilots. Initial problems seemed to be with the RAL BDII not giving the right (or any) information for Bristol, but this has been fixed and jobs now seem to be failing with connection errors. Winnie and Lukasz are working on it. In Progress (1/12)
138041 (1/11)
CMS transfers failing from Bristol. Files are on disk but not in the DPM namespace - waiting on a fix to the DPM shell to proceed with this. I think I'm waiting on the same update. In progress (30/11)
131613 (3/11/17)
Bristol's v6 ticket. Has there been any recent progress here? In progress (9/10)
BIRMINGHAM
137801 (17/10)
The ticket tracking the decommissioning of the old Birmingham DPM. All proceeding as expected, switch off date is the 10th of December. In progress (26/11)
138244 (12/11)
Availability ticket, which will continue to be alarming during the decommissioning process. On Hold (12/11)
131612 (3/11/17)
Birmingham's v6 ticket. No updates since August, any news Mark? Even a confirmation of there being no news would be useful. On Hold (27/8)
GLASGOW
134689 (23/4)
Perfsonar upgrade to CentOS7 ticket. On hold whilst trying to get v6 to work. On Hold (30/10)
131611 (3/11/17)
Glasgow's v6 ticket. v6 was enabled but the v6 packets aren't flowing. Any updates on diagnosing/fixing this? In progress (22/10)
EDINBURGH
138243 (12/11)
ROD Availability ticket, caused by a late lcg-CA Package update. Metrics are on the mend, but Andy added his thoughts on the reliability of these reliability metrics. On Hold (19/11)
131610 (3/11/17)
The ECDF v6 ticket. Any news on your next IPv6 rollout plans? In progress (10/9)
DURHAM
134687 (23/4)
Request to update Perfsonar to CentOS7. Adam gave a plan to do this in the site's big C7 rollout, expected at the start of next year. Luckily not long off now! In progress (6/11)
131609 (3/11/17)
Durham's v6 ticket. After painting a bleak picture of mid-2019 as the earliest they could expect a full v6 rollout at Durham, Duncan has asked some questions to try to help things along. In Progress (should be On Hold?) (29/11)
SHEFFIELD
131608 (3/11/17)
Just the v6 ticket at Sheffield. A positive update from Elena at the end of October hoping to dual-stack the perfsonar boxes by mid-November. Have you managed to do this yet? Do you need a hand with anything? In progress (30/10)
MANCHESTER
137112 (11/9)
Atlas SRM space reporting broken by a dodgy drain moving data outside of tokens. A repair script has been running for a long time, and it should be just about fixed. Alessandra has asked Tim to check the atlas-eye view of the space reporting to see if this issue can be closed. Waiting for reply (29/11)
131607 (3/11/17)
Manchester's v6 ticket. Some good news here, with a new v6 range being put into production and a hope that the storage will be dual-stacked for Christmas. Nice. In progress (3/12)
LIVERPOOL
131606 (3/11/17)
Only the v6 ticket at Liverpool. No news for a long while on this ticket though. In progress (4/6)
LANCASTER
138365 (19/11)
Providing storage dumps for the t2k files at Lancaster as a precursor to the move to the DFC. It's proving more difficult than initially thought, in part due to a lot of files not having their checksum information in them. Getting DPM to calculate and store these has been more of a pain than it should have been. In progress (3/12)
137996 (30/10)
A ROD ticket for a failed webdav test. Waiting on a new patch for DPM that will fix the dodgy behaviour. No sign of it yet though. On Hold (5/11)
136635 (9/8)
Availability ROD ticket. Just a few (smooth) days away from being able to close this one. On Hold (5/11)
RHUL
131603 (3/11/17)
Just the v6 ticket for RHUL. Last word is that central IT are outsourcing v6 DNS to JANET. Any news on how this is going? We'd like to hear more on this experience. In Progress (29/10)
QMUL
138364 (19/11)
The QM ticket for the T2K DFC migration. Dan was quick to provide the dump, and has tried to migrate the data (I think mainly successfully?). Just needing to clear up some details. In progress (28/11)
134573 (17/4)
CMS request to install singularity. This is waiting on the move to CentOS7 at QM, which currently has a test setup with a pre-production queue hopefully coming before Christmas. On Hold (5/11)
IMPERIAL
138360 (19/11)
The Imperial ticket for the T2K DFC migration. On Hold after the files have been removed and the file dumps provided. On Hold (3/12)
138359 (19/11)
Master ticket for all the T2K DFC migrations. Since I started writing this Daniela added child tickets for:
Oxford: 138647
Liverpool: 138648
RALPP: 138651
Sheffield: 138649
BRUNEL
138498 (26/11)
LHCB not able to access a Brunel ARC CE. This ticket appears to have been missed. Assigned (26/11) Update - the failures were due to an ARC update, but jobs are running fine now. Solved.
133956 (9/3)
CMS asking to update the Brunel xroot configs. Raul updated DPM last week and hoped to enable DOME (which will enable the xroot changes) later in the week. Any joy? In Progress (26/11)
100IT have a ticket: 137306
TIER 1
138361 (19/11)
The Tier 1's T2K migration to the DFC ticket. Alastair provided the file dump over the weekend. In progress (1/12)
138493 (26/11)
CMS transfers failing from RAL to T2_CH_CERN. Turned out to be one bad file originally, but now more have appeared. Reopened (3/12)
138500 (26/11)
A similar ticket, CMS transfers failing from T2_PL_Swierk to RAL. This one has been bounced to RAL, I don't know if that was right or fair. In progress (28/11)
138613 (29/11)
CMS asked to check a file that was failing to stage from tape. It looks like the file isn't in Castor at all. In Progress (3/12) Update - file being globally invalidated
138584 (28/11)
CMS xroot reads timing out for SAM tests. It looked to be an intermittent problem, and might have disappeared over the weekend. In progress (30/11)
138461 (22/11)
Winnie's ticket concerning Bristol's old bdii being "stuck" in the RAL Top BDII. Any word on this from the RAL BDII admins? I admit I haven't digested the lcg-rollout thread(s). In progress (26/11)
138033 (1/11)
Atlas singularity jobs failing at RAL. It looks like some progress was made on this last week from both sides. In progress (30/11)
137897 (23/10)
enmr jobs not being accounted at RAL, but it looks like that's because they never successfully ran. A ticket has been submitted to dirac (138414) to get to the bottom of this. In Progress (28/11)
137822 (18/10)
LHCB ticket regarding the FTS being in a "bad state". Waiting to restart the castor -> echo migration to test to see if the problems can be duplicated, as they appear to happen under heavy load on the RAL FTS. On Hold (22/11) Update - transfers seem to be working fine now, so the ticket has been closed.