Monday 5th November 14:00 GMT</br>
32 Open UK Tickets this week. It's the first Monday of the month, so we get to look at all of them. Have all the GGUS access problems experienced by atlas team members last week soothed themselves?</br>
It's worth noting that a quarter of the open tickets concern networking/transfer type problems.</br>
- UNSUPPORTED GLITE SOFTWARE TICKETS</br>
Congratulations to those sites who closed their tickets. I suspect these will be gone over in greater detail, so again I'll just summarise them; we can look at each in the meeting if needed. All seem to be in hand, but my rule of thumb is that the more recent the update, the lesser the worry.
BRISTOL: https://ggus.eu/ws/ticket_info.php?ticket=87472 (17/10) In Progress (25/10)</br>
CAMBRIDGE: https://ggus.eu/ws/ticket_info.php?ticket=87470 (17/10) In Progress (30/10)</br>
BRUNEL: https://ggus.eu/ws/ticket_info.php?ticket=87469 (17/10) In Progress (30/10)</br>
UCL: https://ggus.eu/ws/ticket_info.php?ticket=87468 (17/10) In Progress (1/11)</br>
MANCHESTER: https://ggus.eu/ws/ticket_info.php?ticket=87467 (17/10) On Hold (24/10) In Progress (5/11)</br>
SHEFFIELD: https://ggus.eu/ws/ticket_info.php?ticket=87466 (17/10) On Hold (31/10)</br>
ECDF: https://ggus.eu/ws/ticket_info.php?ticket=87171 (10/10) In progress (30/10) (5/11)</br>
EFDA-JET: https://ggus.eu/ws/ticket_info.php?ticket=87169 (10/10) In Progress (31/10)</br>
https://ggus.eu/ws/ticket_info.php?ticket=87813 (25/10)</br>
Migration of vo.helios-vo.eu to Manchester. The transfer was completed manually, and users were asked if things are okay. In Progress, I set it to "Waiting for reply" today. (30/10) David indicates it works and will now test with WMS/CE (5/11)
https://ggus.eu/ws/ticket_info.php?ticket=88112 (3/11)</br>
Slow atlas transfers, found to be caused by database problems. The problems have been fixed, the atlas instance restarted and data is flowing once more. Waiting for the thumbs up from atlas. Waiting for reply (5/11)
https://ggus.eu/ws/ticket_info.php?ticket=86690 (3/10)</br>
t2k are missing JPKEKCRC02 FTS ganglia metrics. There were some problems with the rrd files that meant they had to be deleted, which hopefully will fix the plots. Things look better to my eyes. In Progress, could be set to "Waiting for reply" or "Solved" (31/10) t2k give the thumbs up, seems okay to them now
https://ggus.eu/ws/ticket_info.php?ticket=86152 (17/9)</br>
Packet loss on the RAL perfsonar. This is being taken under the wing of wider network investigations at RAL. On hold (31/10)
https://ggus.eu/ws/ticket_info.php?ticket=68853 (22/3/11)</br>
DPM SL4 retirement ticket. The only reason this is open is the possible SL4 disk servers at Durham, right? Are they still there? In progress (30/10)
https://ggus.eu/ws/ticket_info.php?ticket=88099 (3/11)</br>
atlas seeing transfers into RALPP fail with "No transfer markers received" errors, although the problem seems to be slowly abating. Still just "Assigned" (4/11) ATLAS still see the problem (5/11). Still just assigned
https://ggus.eu/ws/ticket_info.php?ticket=88019 (1/11)</br>
lhcb seeing failures on some nodes, blaming cvmfs. Raul has put CE in downtime. In Progress (1/11)
https://ggus.eu/ws/ticket_info.php?ticket=88009 (1/11)</br>
Hone with one of their usual politely worded requests to get their jobs moving. Mark tweaked the batch system, and hone are happy again. In progress, can be closed (2/11) Solved
https://ggus.eu/ws/ticket_info.php?ticket=86105 (14/9)</br>
Poor sonar rates between Birmingham & BNL. Investigation made difficult due to EMI2 problems with the DPM. Brian has tried to see if doubling the number of streams would help. Did it? On hold (16/10)
https://ggus.eu/ws/ticket_info.php?ticket=88151 (5/11)</br>
apel nagios test problems. Assigned (5/11)
https://ggus.eu/ws/ticket_info.php?ticket=86242 (20/9)</br>
Biomed not cleaning out their cream sandbox. Mike pulled them up about this a while ago but no reply. We should close this ticket and/or re-ticket the VO if they're causing a mess. Waiting for reply (4/10)
https://ggus.eu/ws/ticket_info.php?ticket=84123 (11/7)</br>
atlas production job failures at Durham, which has become a bit of a catch-all ticket for atlas problems at Durham. On hold (3/9)
https://ggus.eu/ws/ticket_info.php?ticket=75488 (19/10/11)</br>
Compchem authentication ticket. On hold, but is it still relevant? (8/10)
https://ggus.eu/ws/ticket_info.php?ticket=88119 (4/11)</br>
Atlas transfers are failing due to a sickly pool node. In Progress (5/11)
https://ggus.eu/ws/ticket_info.php?ticket=87958 (31/10)</br>
atlas transfers between Edinburgh & FZK having problems, likely due to their firewall. FZK had been ticketed (no ticket number given though). In Progress (1/11)
https://ggus.eu/ws/ticket_info.php?ticket=86334 (24/9)</br>
Poor atlas sonar rates between ECDF & BNL. Wahid has "harmonised" his tcp tunings, and is waiting on some further WAN upgrades. On hold (25/10)
https://ggus.eu/ws/ticket_info.php?ticket=87879 (29/10)</br>
na62 mapping problems, traced to a pool node not making its grid map. Seems things are fixed now, despite the user's initial protests to the contrary. Turns out they were just being impatient! In progress, can be closed (30/10) SOLVED
https://ggus.eu/ws/ticket_info.php?ticket=86996 (8/10)</br>
Sussex's APEL problems. Things look better now after a lot of work. In progress, can be closed (5/11)
https://ggus.eu/ws/ticket_info.php?ticket=81784 (1/5)</br>
The Sussex Certification Chronicle. Surely the Grid Overlords are satisfied that Sussex is worthy of certification, after paying so much tribute in tears and sanity? :-) In progress (bit quiet though) (23/10) SOLVED! SUSSEX IS ONE OF US NOW...
https://ggus.eu/ws/ticket_info.php?ticket=86306 (22/9)</br>
Hard-to-kill lhcb jobs at QMUL. Chris is still getting regular hit-lists. Chris's corresponding ticket to the cream developers (https://ggus.eu/tech/ticket_show.php?ticket=87891) has problems as lhcb can't reply to it! He has however written information in this ticket. In progress (1/11)
https://ggus.eu/ws/ticket_info.php?ticket=86108 (14/9)</br>
Perfsonar WAN bandwidth asymmetry. Been on hold for a while, so the classic question must be asked: has the problem gone away all by itself? On hold (2/10)
https://ggus.eu/ws/ticket_info.php?ticket=86106 (14/9)</br>
Low atlas sonar rates between BNL and Oxford. Tweaking the FTS settings hasn't made any difference. The next step was to tweak tcp tuning parameters. Duncan observed similar transfer rates between Oxford & TRIUMF. In progress (19/10) Tuning tcp didn't help, what to do next...
https://ggus.eu/ws/ticket_info.php?ticket=85367 (20/8)</br>
ilc jobs were aborting on one of Lancaster's CEs. This CE has poor performance, which for some reason was affecting ilc jobs more than most. The only fix is a reinstall (and reconfigure), but other priorities keep getting in the way (the latest being the use of this CE to test EMI2 tarballs). On hold (5/11)
t2k.org transfer timeout failures between RAL and Lancaster. Traffic is in the process of being routed over SJ5 from the lightpath to see if that helps. Other than that, there is the possibility that files are taking too long to stage from tape, but there's no reason why that would only be a problem for us. In progress (1/11)