2007-Q1 Transfer Tests

From GridPP Wiki
Revision as of 15:16, 24 January 2008 by Michael kenyon (Talk | contribs)


As part of the LCG Dress Rehearsals, it's time to review site readiness and capacity with a new round of transfer tests, similar to those performed as part of SC4.

Overview

The milestones set last time for data transfers were:

  • Tier-2 Transfers (End of Summer 2006) - Target rate 250Mb/s, subject to external conditions
  • T1 to T2 - Target rate 300-500Mb/s
  • Inter T2 Transfers - Target rate 100 Mb/s

Milestones

For this round we will be using the targets of:

  • T1 to T2 - Target rate 300Mb/s or better
  • Intra T2 - Target rate 200Mb/s or better reading / writing.

Timetable

dteam

  • Select at least 2 reference sites per T2 for initial T1 to T2 tests.
    • Scotgrid - Gla(dpm) Ed(dCache)
    • Northgrid - Lancs(dCache) Shef(dpm)
    • Southgrid - ox(dpm) bir(dpm) dCache in SouthGrid?- RAL
    • LondonT2 - IC-HEP(dCache) QMUL(dpm?)
  • Test modestly with 50G, checking for major config snafus or very slow rates.
  • Thrash overnight with 1T (1000*1G).
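As a rough check on why the 1T run is an overnight job, the sketch below estimates how long each test volume takes at the 300Mb/s T1 to T2 target. It assumes 1G means 10^9 bytes (consistent with the 50000000000.0 bytes reported for a 50-file run further down) and that Mb/s means 10^6 bits per second; the function name is illustrative only.

```python
def transfer_hours(total_bytes: float, rate_mbps: float) -> float:
    """Hours needed to move total_bytes at rate_mbps (10^6 bits per second)."""
    seconds = total_bytes * 8 / (rate_mbps * 1e6)
    return seconds / 3600


# Assumption: 1G = 10^9 bytes, target rate = 300 Mb/s (the T1 to T2 milestone).
modest = transfer_hours(50 * 1e9, 300)       # the 50G sanity test
overnight = transfer_hours(1000 * 1e9, 300)  # the 1T (1000*1G) run

print(f"50G at 300Mb/s: {modest * 60:.1f} min")
print(f"1T at 300Mb/s:  {overnight:.1f} h")
```

At the target rate the 50G test should finish in a little over 20 minutes, while the 1T run needs roughly seven and a half hours, so a site running well below target will not finish the large test in one night.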

Then move on to canned tests within a T2, run round robin so that we get read/write rates between sites.
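One way to enumerate the round-robin tests within a T2 is to list every ordered (source, destination) pair, so each site is exercised both reading and writing. This is only a sketch: the site list is taken from the ScotGrid results heading, and the helper name is hypothetical, not part of any existing test script.

```python
from itertools import permutations


def round_robin_pairs(sites):
    """Return every ordered (source, destination) pair of distinct sites."""
    return list(permutations(sites, 2))


# Illustrative site list for one T2 (ScotGrid).
scotgrid = ["Glasgow", "Edinburgh", "Durham"]

for src, dst in round_robin_pairs(scotgrid):
    print(f"{src} -> {dst}")
```

For n sites this gives n*(n-1) transfers, so a three-site T2 needs six tests and a seven-site T2 needs forty-two.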


experiments

Potentially clashing tests from the experiments.

Results

Transfer Test Results - March 06. Source 1st Column.
ScotGrid
dest-> Glasgow Edinburgh Durham
RAL Tier1

Mar 05 - 50 Files
fail (no rate recorded) PG

mar 06 - 50 Files
done 290Mb/s PG

mar 06 - 114 Files
done 251Mb/s PG

mar 06 - 500 Files
done 90Mb/s PG

mar 06 - 50 Files
done 263.5Mb/s PG

mar 06 - 557 Files
done 270Mb/s PG

Glasgow ---

mar 08 - 263 Files
done 229Mb/s PG

mar 08 - 1000 Files
done 226Mb/s PG

Edinburgh

mar 08 - 490 Files
done 506Mb/s PG

---
Durham ---
NorthGrid
dest-> Lancaster Liverpool Manchester Sheffield
RAL Tier1

mar 08 - 68 Files
done 99Mb/s PG

mar 12 - 10 Files
done 118Mb/s PG

mar 12 - 1000 Files
done 133Mb/s PG

Lancaster ---
Liverpool ---
Manchester ---
Sheffield ---
SouthGrid
dest-> Birmingham Bristol Cambridge Oxford Warwick RAL Tier2
RAL Tier1

mar 09 - 10 Files
failed (no rate recorded) PG

mar 12 - 450 Files
cancelled 116Mb/s PG

mar 12 - 527 Files
cancelled 390Mb/s PG

mar 09 - 10 Files
done 215Mb/s PG

mar 09 - 307 Files
cancelled 341Mb/s PG

mar 08 - 998 Files
done 276Mb/s PG

Birmingham ---
Bristol ---
Cambridge ---
Oxford ---
Warwick ---
RAL_Tier2 ---
London Tier2
dest-> Brunel UCL-HEP UCL-CENTRAL IC-HEP IC-LeSC QMUL RHUL
RAL Tier1

mar 09 - 10 Files
done 644Mb/s PG

mar 09 - 384 Files
cancelled 420Mb/s PG

mar 09 - 10 Files
done 44Mb/s PG

Brunel ---
UCL-HEP ---
UCL-CENTRAL ---
IC-HEP ---
IC-LeSC ---
QMUL ---
RHUL ---


Issues Discovered

Mar 5

  • RAL -> GLA take 1 was slow. The channel's number of concurrent files was set to only one.
glite-transfer-channel-list RALLCG2-UKISCOTGRIDGLASGOW
Number of files: 1, streams: 1

Upped that to 8 with a quick

glite-transfer-channel-set -f 8 RALLCG2-UKISCOTGRIDGLASGOW

The overall status of that test was Failed, due to:

 Source:      srm://ralsrma.rl.ac.uk:8443/srm/managerv1?SFN=//castor/ads.rl.ac.uk/prod/grid/hep/disk1tape1/dteam/j/jkf/castorTest/1GBcanned016
 Destination: srm://svr018.gla.scotgrid.ac.uk:8443/srm/managerv1?SFN=/dpm/gla.scotgrid.ac.uk/home/dteam/aetest10/tfr000-file00016
 State:       Failed
 Retries:     4
 Reason:       Failed on SRM get: Failed SRM get on httpg://ralsrma.rl.ac.uk:8443/srm/managerv1 ; id=818873759 call, no TURL retrieved for srm://ralsrma.rl.ac.uk//castor/ads.rl.ac.uk/prod/grid/hep/disk1tape1/dteam/j/jkf/castorTest/1GBcanned016
 Duration:    0

Retrying using 10 source files that are known good (and will repeat for other T1-T2 tests)


  • RAL->GLA Take 2
transfer 0 (721e56a6-cbc8-11db-9388-dffd0907342b)
50/50 (50000000000.0) transferred. Started at 9:53:55, Done at 10:15:52, Duration = 0:21:57, Bandwidth = 303.622157873Mb/s
 
Date of Submission was 6/3/2007
Total number of FTS submissions = 1
 50/50 transferred in 1379.30051994 seconds
 50000000000.0bytes transferred.
Average Bandwidth:290.002065697Mb/s
  • RAL -> EDI Take 1
./filetransfer.py --number=50 --uniform-source-size --ignore-status-error --delete --background \
 srm://ralsrma.rl.ac.uk:8443//castor/ads.rl.ac.uk/prod/grid/hep/disk1tape1/dteam/j/jkf/castorTest/1GBcanned00[0:9] \
 srm://srm.epcc.ed.ac.uk:8443/pnfs/epcc.ed.ac.uk/data/dteam/atest10/
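The average bandwidth in the RAL->GLA take 2 summary above can be reproduced from the byte count and elapsed time it reports, which is a handy sanity check when comparing runs. The sketch assumes Mb/s means 10^6 bits per second; the function name is illustrative.

```python
def bandwidth_mbps(total_bytes: float, seconds: float) -> float:
    """Average rate in Mb/s (10^6 bits per second) for a completed transfer."""
    return total_bytes * 8 / seconds / 1e6


# Figures copied from the RAL->GLA take 2 summary.
rate = bandwidth_mbps(50000000000.0, 1379.30051994)
print(f"{rate:.3f} Mb/s")  # 290.002 Mb/s
```

The per-transfer figure of about 303.6Mb/s is higher because it is computed over the shorter started-to-done interval (0:21:57, about 1317 seconds) rather than the 1379 seconds used for the overall average.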

Mar 23

Suspended dteam testing while CMS transfers are in progress. Bottleneck at RAL firewall. Awaiting RAL network upgrade. See Ops Blog Posting

Channel Capacity

Realised that not all glite-transfer-channel-lists were created equal, so drew up a quick table (using this script):

Site                      From RAL  Star  To RAL
UKI-NORTHGRID-SHEF-HEP    8         8     1
UKI-LT2-IC-LESC           1         1     1
UKI-SOUTHGRID-BHAM-HEP    8         5     1
UKI-SOUTHGRID-OX-HEP      8         8     1
UKI-LT2-IC-HEP            10        40    10
UKI-SOUTHGRID-CAM-HEP     8         8     1
UKI-LT2-UCL-HEP           1         8     8
SCOTGRID-EDINBURGH        8         8     1
UKI-LT2-UCL-CENTRAL       8         8     1
UKI-NORTHGRID-MAN-HEP     1         8     1
UKI-SCOTGRID-GLASGOW      8         20    10
UKI-NORTHGRID-LANCS-HEP   8         10    12
UKI-SOUTHGRID-BRIS-HEP    1         8     5
UKI-LT2-BRUNEL            8         8     8
UKI-SCOTGRID-DURHAM       1         8     1
UKI-SOUTHGRID-RALPP       8         8     8
UKI-LT2-RHUL              5         8     1
UKI-LT2-QMUL              2         3     1
UKI-NORTHGRID-LIV-HEP     1         8     1
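The script used to build this table is not reproduced here. As a minimal sketch of how per-channel settings could be collected, the function below pulls the file and stream counts out of text shaped like the "Number of files: 1, streams: 1" line shown in the Mar 5 notes. It assumes the glite-transfer-channel-list output contains a line in exactly that form; the function name and regular expression are illustrative, and running the command itself is left out.

```python
import re

# Assumption: the channel listing contains a line like
# "Number of files: 1, streams: 1", as quoted in the Mar 5 notes.
_SETTINGS = re.compile(r"Number of files:\s*(\d+),\s*streams:\s*(\d+)")


def parse_channel_settings(output: str):
    """Return (files, streams) found in the listing text, or None if absent."""
    match = _SETTINGS.search(output)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


sample = "Number of files: 1, streams: 1"
print(parse_channel_settings(sample))  # (1, 1)
```

Feeding the saved output for each channel name through a parser like this, one channel per site and direction, would give one row per site in a table of the kind shown above.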