Tier-2 SSD Study

Background

There are several potentially I/O-bound services within a typical LHC Tier-2. On the front end, services backed by a database (Storage Elements, the Logging and Bookkeeping Service) or with high instantaneous write demand (CREAM CEs during job submission) are obvious candidates. On the back end, worker nodes running many single-threaded jobs and storage nodes delivering many simultaneous files can both exhibit I/O-limited efficiency.

Why SSDs?

Solid-state disks (SSDs) are increasingly replacing high-speed hard disks as the storage technology in environments dominated by random I/O. Because the rate at which an SSD can seek to a given piece of data is limited only by its addressing system, rather than by the rotational speed of a platter, SSD read performance can be significantly better than that of even the fastest hard disks. Write performance does not scale as effectively: the internal architecture of Flash memory means that the minimum write block is much larger than a typical read block, and the physical mechanism used for writes is intrinsically slower and more power-hungry than reading. (For this reason, even expensive SSDs still use DRAM for write caching, just as HDDs do.)
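The practical consequence is that random-read performance is the quantity worth measuring when sizing a /tmp or cache device. The sketch below is a minimal, illustrative micro-benchmark of random 4 KiB reads on a mount point; the paths, file size and read count are assumptions, and the study itself used HammerCloud rather than anything like this script.

 # Illustrative micro-benchmark of random 4 KiB reads on a mount point.
 # Paths, file size and read count are assumptions; the study used HammerCloud.
 import os
 import random
 import time

 def random_read_rate(path, file_size=256 * 1024 * 1024, block=4096, reads=2000):
     """Create a scratch file under `path` and time random 4 KiB reads from it."""
     scratch = os.path.join(path, "iotest.bin")
     with open(scratch, "wb") as f:
         for _ in range(file_size // (1024 * 1024)):
             f.write(os.urandom(1024 * 1024))      # incompressible data, 1 MiB at a time
         f.flush()
         os.fsync(f.fileno())
     fd = os.open(scratch, os.O_RDONLY)
     try:
         # Advise the kernel to drop the file from the page cache so reads hit
         # the device; a serious test would use O_DIRECT or a tool such as fio.
         os.posix_fadvise(fd, 0, 0, os.POSIX_FADV_DONTNEED)
         start = time.time()
         for _ in range(reads):
             offset = random.randrange(0, file_size - block)
             os.pread(fd, block, offset)           # one random 4 KiB read
         elapsed = time.time() - start
     finally:
         os.close(fd)
         os.remove(scratch)
     return reads / elapsed                        # approximate random reads per second

 if __name__ == "__main__":
     for mount in ("/tmp", "/mnt/ssd"):            # hypothetical mount points
         print("%-10s %8.0f reads/s" % (mount, random_read_rate(mount)))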

SSDs in WNs

We tested the performance of two commercial SSDs (a Kingston SSDNow V Series 128 GB, "Kingston SSD", and an Intel X25-M G2 160 GB, "Intel SSD") as the physical mount points for the /tmp directories on worker nodes at Glasgow. We compared their performance not only against single-HDD configurations, but also against pairs of HDDs in RAID0 and RAID1. The performance testing was undertaken with the ATLAS HammerCloud infrastructure.
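For reference, the sketch below shows how a two-disk RAID array might be assembled and mounted as /tmp. This is not the actual Glasgow configuration: the device names, the ext4 filesystem and the ad-hoc scripting are assumptions made purely for illustration.

 # Illustrative sketch (not the actual Glasgow setup): assembling two HDDs into
 # a RAID array with mdadm and mounting it as /tmp.  Device names, the ext4
 # filesystem and the mount point are assumptions.
 import subprocess

 def run(cmd):
     print("+", " ".join(cmd))
     subprocess.run(cmd, check=True)

 def make_raid_tmp(level, devices, md="/dev/md0", mountpoint="/tmp"):
     """Create a RAID array from `devices` and mount it at `mountpoint`."""
     run(["mdadm", "--create", md, "--level", str(level),
          "--raid-devices", str(len(devices))] + list(devices))
     run(["mkfs.ext4", "-F", md])
     run(["mount", md, mountpoint])

 if __name__ == "__main__":
     # level=0 gives the striped (RAID0) configuration tested on node302;
     # level=1 would give the mirrored (RAID1) configuration instead.
     make_raid_tmp(0, ["/dev/sdb", "/dev/sdc"])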

Performance in ATLAS HammerCloud tests

A mix of HammerCloud tests was performed, covering both file-staging access (FileStager, which copies each input file to the local /tmp area before the job reads it) and direct I/O access (DQ2_LOCAL, where the job reads directly from the storage element over rfio).


FileStager (with ATLAS ROOT files "optimally reordered")
 Node names   | Storage type    | Cores/jobs | Mean job efficiency | Mean throughput
 node300-302  | HDD             | 8          | 0.817               | 6.54
 node303-304  | Kingston SSD    | 8          | 0.626               | 5.01
 node305-309  | Intel SSD       | 8          | 0.765               | 6.12
 node310      | Magny-Cours/SSD | 24         | 0.487               | 11.70

FileStager (with ATLAS ROOT files "optimally reordered")
 Node names   | Storage type    | Cores/jobs | Mean job efficiency | Mean throughput
 node300-301  | HDD             | 8          | 0.819               | 6.55
 node302      | RAID0 HDD       | 8          | 0.885               | 7.08
 node303-304  | Kingston SSD    | 8          | 0.6                 | 4.8
 node305-309  | Intel SSD       | 8          | 0.8                 | 6.4
 node310      | Magny-Cours/SSD | 24         | 0.45                | 10.8

FileStager (with ATLAS ROOT files "optimally reordered")
 Node names   | Storage type               | Cores/jobs | Mean job efficiency    | Mean throughput
 node300-301  | HDD                        | 8          |                        |
 node302      | RAID1 HDD                  | 8          | 0.772                  | 6.18
 node303      | Kingston SSD               | 8          |                        |
 node305      | Intel SSD                  | 8          |                        |
 node310      | Magny-Cours/RAID0 HDD (x2) | 24         | 0.830 (x2 correction*) | 19.93

Direct rfio access ("DQ2_LOCAL", with ATLAS ROOT files "optimally reordered")
 Node names   | Storage type               | Cores/jobs | Mean job efficiency    | Mean throughput
 node300-301  | HDD                        | 8          | 0.781                  | 6.25
 node302      | RAID1 HDD                  | 8          | 0.735                  | 5.88
 node303      | Kingston SSD               | 8          |                        |
 node305      | Intel SSD                  | 8          |                        |
 node310      | Magny-Cours/RAID0 HDD (x2) | 24         | 0.727 (x2 correction*) | 17.44
 * Magny-Cours results required a factor-of-two (x2) correction due to a broken system clock.
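As a sanity check on the numbers above, the reported mean throughput appears to be the mean job efficiency multiplied by the number of cores/jobs, i.e. the effective number of fully-efficient job slots per node. A quick check with the rounded values:

 # Spot-check with rounded values from the tables above: throughput is
 # approximately efficiency x cores/jobs in every fully-reported row.
 rows = [
     ("node300-302, HDD",               8,  0.817, 6.54),
     ("node310, Magny-Cours/SSD",       24, 0.487, 11.70),
     ("node310, Magny-Cours/RAID0 HDD", 24, 0.830, 19.93),
 ]
 for name, cores, efficiency, throughput in rows:
     print("%-32s %6.2f ~ %6.2f" % (name, cores * efficiency, throughput))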

With pCache

For more information on pCache testing, see the ATLAS pCache study. It is clear that pCache, given good cache behaviour, increases the importance of having good IOPS and bandwidth on the device hosting the cache.
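For illustration only, the general pattern is sketched below; this is not pCache's actual code, and the cache directory and hashing scheme are assumptions. Every cache hit replaces a copy from the storage element with reads from the local cache device, which is exactly where that device's IOPS and bandwidth start to matter.

 # Minimal illustration of the caching pattern (this is NOT pCache's actual
 # code): inputs are copied into a local cache directory on first use and
 # served from there afterwards, so cache hits become local disk I/O.
 import hashlib
 import os
 import shutil

 CACHE_DIR = "/tmp/pcache-demo"                   # assumed cache location

 def cached_copy(source_path, dest_path):
     """Fetch `source_path` via a local cache keyed on the source name."""
     os.makedirs(CACHE_DIR, exist_ok=True)
     key = hashlib.sha1(source_path.encode()).hexdigest()
     entry = os.path.join(CACHE_DIR, key)
     if not os.path.exists(entry):                # miss: pay the full copy once
         shutil.copy(source_path, entry)
     shutil.copy(entry, dest_path)                # hit: local reads only
     return dest_path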

Long Term Tests

After the initial testing phase completed, the Intel SSDs were left in the majority of the test WNs, and those WNs remained accessible to general job loads. In light of the earlier tests, the majority of the cluster was later retrofitted with RAID0 HDD mounts. As a result, we can provide long-term performance comparisons for the WNs to date.

SSDs in Service Nodes

As a Database host

?