FTS transfer failure study

From GridPP Wiki

Here is an analysis of failure-mode rates for FTS transfers:

Table of Data

Status Total ATLAS CMS LHCB T2K Month
Completed 861256 806031 34095 3138 17992 September
Completed 416816 373260 32082 10681 793 April
Completed 557990 484733 46257 5499 21501 May
Completed 774560 709013 59290 5743 514 June
CONNECTION_ERROR 32732 32544 173 14 1 September
CONNECTION_ERROR 2234 2002 227 5 0 April
CONNECTION_ERROR 6937 5773 1128 1 35 May
CONNECTION_ERROR 29188 25062 4123 3 0 June
CONSISTENCY_ERROR 0 0 0 0 0 September
CONSISTENCY_ERROR 3 3 0 0 0 April
CONSISTENCY_ERROR 0 0 0 0 0 May
CONSISTENCY_ERROR 3 3 0 0 0 June
FILE_EXISTS 2190 1326 54 0 810 September
FILE_EXISTS 878 554 319 5 0 April
FILE_EXISTS 3032 658 20 5 2349 May
FILE_EXISTS 1529 213 1280 0 36 June
GENERAL_FAILURE 24318 21984 833 1442 59 September
GENERAL_FAILURE 50069 48835 359 543 332 April
GENERAL_FAILURE 6442 1741 695 3981 25 May
GENERAL_FAILURE 9149 5628 3362 147 12 June
GRIDFTP_ERROR 13729 13022 693 1 13 September
GRIDFTP_ERROR 10505 9013 162 30 0 April
GRIDFTP_ERROR 4205 1665 2319 0 221 May
GRIDFTP_ERROR 13917 11736 2178 3 0 June
HTTP_TIMEOUT 4531 4439 73 19 0 September
HTTP_TIMEOUT 1856 1761 95 0 0 April
HTTP_TIMEOUT 2813 2563 228 0 22 May
HTTP_TIMEOUT 2022 1824 198 0 0 June
INTERNAL_ERROR 5309 5280 28 1 0 September
INTERNAL_ERROR 2975 2947 28 0 0 April
INTERNAL_ERROR 444 410 34 0 0 May
INTERNAL_ERROR 3194 2771 97 326 0 June
INVALID_PATH 5545 5534 10 0 1 September
INVALID_PATH 619 187 432 0 0 April
INVALID_PATH 2470 2343 127 0 0 May
INVALID_PATH 282 233 49 0 0 June
INVALID_SIZE 168 143 25 0 0 September
INVALID_SIZE 45 0 45 0 0 April
INVALID_SIZE 63 56 7 0 0 June
LOCALITY 1318 1295 23 0 0 September
LOCALITY 20 0 0 20 0 April
LOCALITY 646 0 102 544 0 May
LOCALITY 1893 1837 29 27 0 June
NO_SPACE_LEFT 676 0 676 0 0 September
NO_SPACE_LEFT 3 0 0 0 0 May
NO_SPACE_LEFT 0 0 0 0 0 June
REQUEST_TIMEOUT 39 39 0 0 0 September
REQUEST_TIMEOUT 21 18 3 0 0 April
REQUEST_TIMEOUT 126 122 4 0 0 June
SECURITY_ERROR 1651 721 926 4 0 September
SECURITY_ERROR 1539 661 519 0 359 April
SECURITY_ERROR 8664 4 3699 0 4961 May
SECURITY_ERROR 746 5 253 0 488 June
STORAGE_INTERNAL_ERROR 3 3 0 0 0 September
STORAGE_INTERNAL_ERROR 2 2 0 0 0 April
STORAGE_INTERNAL_ERROR 11 4 7 0 0 May
STORAGE_INTERNAL_ERROR 12130 6763 30 5337 0 June
TRANSFER_MARKERS_TIMEOUT 897 189 707 1 0 September
TRANSFER_MARKERS_TIMEOUT 1093 80 1013 0 0 April
TRANSFER_MARKERS_TIMEOUT 1060 40 1016 4 0 May
TRANSFER_MARKERS_TIMEOUT 2669 182 2487 0 0 June
TRANSFER_TIMEOUT 1605 1412 183 9 1 September
TRANSFER_TIMEOUT 2604 410 1972 222 0 April
TRANSFER_TIMEOUT 307 20 285 2 0 May
TRANSFER_TIMEOUT 3709 2078 1631 0 0 June
USER_ERROR 928 418 0 0 510 September
USER_ERROR 137 98 18 1 20 April
USER_ERROR 3291 5 0 0 3286 May
USER_ERROR 1081 454 627 0 0 June
Total errors 292289 229137 36914 12696 13542
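
For anyone re-checking these figures, the rows above can be read mechanically. The following is a minimal sketch, assuming each row has the layout "status, total, four per-VO counts, month" as in the table; the names `Row` and `parse_row` are hypothetical and not part of any FTS tooling.

```python
from typing import NamedTuple

# Column order of the per-VO counts, as given in the table header.
VOS = ("ATLAS", "CMS", "LHCB", "T2K")


class Row(NamedTuple):
    status: str   # "Completed" or a failure mode such as "GRIDFTP_ERROR"
    total: int    # value of the Total column
    per_vo: dict  # VO name -> count
    month: str


def parse_row(line: str) -> Row:
    """Split one whitespace-separated table row into its fields."""
    fields = line.split()
    status, month = fields[0], fields[-1]
    counts = [int(f) for f in fields[1:-1]]
    if len(counts) != 1 + len(VOS):
        raise ValueError(f"unexpected column count: {line!r}")
    return Row(status, counts[0], dict(zip(VOS, counts[1:])), month)


row = parse_row("Completed 861256 806031 34095 3138 17992 September")
# Check that the Total column equals the sum of the per-VO columns.
print(row.total == sum(row.per_vo.values()))
```

Comparing the Total column against the per-VO sum in this way is a quick consistency check on each row.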

Analysis

  • Percentage completion rates per month are as follows:
VO April May June September April-June
ATLAS 82.17% 96.85% 92.3% 90.12%
CMS 79.89% 79.02% 78.38% 88.56%
LHCB 92.27% 17.49% 49.57% 68%
T2K 10.34% 49.31% 48.95% 92.80%
Summed 82.11% 92.76% 90.46% 90.0%
  • The failure rate has fallen since April but is now "stable" at about 1 in 10 transfers.
  • Some failure modes have increased in rate and some have decreased.
    • Behaviour differs between VOs.
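
The page does not state how the completion rates listed above were derived. As a minimal sketch, assuming the rate is completed transfers divided by all attempted transfers (completed plus failed), the September figures can be reproduced from the table; the function name `completion_rate` is hypothetical, and other months may have been computed differently.

```python
def completion_rate(completed: int, failed: int) -> float:
    """Completed transfers as a percentage of all attempted transfers."""
    attempted = completed + failed
    if attempted == 0:
        raise ValueError("no transfers attempted")
    return 100.0 * completed / attempted


# September "Total" column for each failure mode, copied from the table
# in the order the modes appear (CONNECTION_ERROR ... USER_ERROR).
september_failures = [
    32732, 0, 2190, 24318, 13729, 4531, 5309, 5545, 168,
    1318, 676, 39, 1651, 3, 897, 1605, 928,
]
september_completed = 861256  # "Completed" row, Total column, September

rate = completion_rate(september_completed, sum(september_failures))
print(f"Summed September completion rate: {rate:.1f}%")
```

Under this assumption the summed September rate comes out at 90.0%, which matches the table and the "1 in 10" failure figure quoted in the analysis.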