Grid Storage

WLCG (the Worldwide LHC Computing Grid) moves and stores hundreds of petabytes of data, much of it replicas of other data, to ensure high availability and resilience. GridPP provides storage in WLCG via the Tier 1 at RAL and via its Tier 2 sites. Grid storage is built on so-called "storage elements" (SEs), which provide a grid interface to quite diverse storage systems. SEs include control interfaces like SRM, information interfaces based on LDAP (more or less), and interfaces for data transfer and data access.

GridPP includes a Storage and Data Management group, which provides support for the SEs, data transfer protocols, performance, resilience and interoperation. It is generally the fount of all knowledge, at least as regards grid storage and its use.

Information about data storage and management in GridPP

Information for users (or potential users)

Users of GridPP resources tend to be grouped together in collaborations (known as Virtual Organisations, or VOs) with a common research purpose. VOs can range in size from several thousand members down to, in principle, a single person. The Main Page has more information about "joining the grid", either as a new VO or as a new member of an existing VO.

As regards data storage, sites can often allocate resources at their own discretion as long as, say, only a few tens of terabytes are required. Experience has shown, however, that even small VOs can suddenly grow and will then need stricter data management, so it is best to plan for growth by requesting - and getting - and using! - a space allocation (identified by a so-called space token).
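As a rough illustration of planning for growth, the sketch below estimates how many months a VO has before its data outgrows an informal allocation. The function name, the 30 TB threshold and the 10% monthly growth rate are all assumptions made up for this example; they are not GridPP policy, and real growth is rarely this regular.

```python
import math


def months_until_threshold(current_tb, monthly_growth_rate, threshold_tb):
    """Estimate whole months until stored data reaches threshold_tb.

    Assumes simple compound growth: volume * (1 + rate) ** months.
    All parameters are hypothetical planning inputs, not GridPP figures.
    """
    if current_tb <= 0 or monthly_growth_rate <= 0:
        raise ValueError("current_tb and monthly_growth_rate must be positive")
    if current_tb >= threshold_tb:
        return 0  # already at or above the informal limit
    # Solve current_tb * (1 + rate) ** n >= threshold_tb for the smallest integer n.
    ratio = threshold_tb / current_tb
    return math.ceil(math.log(ratio) / math.log(1 + monthly_growth_rate))


if __name__ == "__main__":
    # Hypothetical VO: 5 TB today, growing 10% per month, informal limit of 30 TB.
    print(months_until_threshold(5, 0.10, 30))
```

If the answer is uncomfortably small, that is a good moment to talk to the site about a proper space allocation rather than waiting until the storage fills up.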

For "small" VOs (by data volume) it is considered best practice to start with data at only one or two sites. By construction, the grid should let you have data "on the grid" and not worry about where it is, but in practice you may be supported by site administrators, particularly in the beginning, and it may be best to start with two sites. (If you are a local VO, i.e. local to a site, that site would of course be your main resource and your main support.)

Information for sites (and prospective sites)

Simple: we have a weekly audio-conference-technology-du-jour meeting at 10.00 London time (and Edinburgh time, and Cardiff time, and Belfast time). If you are a site running storage for GridPP it is very highly recommended to join this meeting. Also, we have a mailing list, gridpp-storage at www.jiscmail.ac.uk. This list is open to all Tier 2 sites (and in fact has members from about ten different countries) but of course mainly focuses on GridPP.

Finally, but not leastly, keep up with our exciting technological technology on our blog: GridPP Storage Blog

Information for everybody else

Have a look at the technical but still somewhat informative - and certainly colourful - dashboard for the Tier 1. Read our technical but technically excellent blog, the GridPP storage blog.

GridPP expertise

Storage experts in GridPP have a lot of expertise with storage systems: mostly, of course, grid storage, but also maintaining the underlying storage clusters, distributed filesystems, and data management for science in general.

Technology

Support

This page is a Key Document, and is the responsibility of Jens Jensen. It was last reviewed on 2014-03-18 when it was considered to be 10% complete. It was last judged to be accurate on (never).