QCDGrid: Probing the building blocks of matter with the power of the Grid


Background

Quantum chromodynamics (QCD) is the study of the building blocks of our universe. Both in the UK and around the world, scientists are developing techniques to quantify the complex behaviour of fundamental particles called quarks and gluons - the constituents of all nuclear matter. Computationally intensive simulations of these particles generate Terabytes of data that must then be analysed to extract the key physical properties. This simulation and analysis rely upon access to state-of-the-art high performance computing resources. The Terabytes of raw physical data created in this field, and the complex metadata required to describe them, give rise to significant storage and access problems that are the focus of this project.

[Figure: map showing members of the UKQCD collaboration.]
UKQCD is a collaboration of leading particle physicists from centres around the United Kingdom, working as part of the wider GridPP programme. Since 2002, software engineers at EPCC have been developing DiGS (Distributed Grid Storage), a data management system that combines the distributed resources of the collaborators into a robust facility called the UKQCD Grid. The result is a multi-Terabyte storage facility spanning the UKQCD sites at Edinburgh (including the University of Edinburgh Advanced Computing Facility), Liverpool, RAL, Southampton, Swansea, and Glasgow.

The facility is based on commodity hardware and open-source software. The hardware consists primarily of Unix/Linux computers managing large RAID storage arrays. On top of this infrastructure, the DiGS software (built with the Globus Toolkit, EGEE application stack, and an XML database) is used to manage the grid. It provides a simple and intuitive environment that hides the complexities of the underlying grid and presents a standard file system to the user. It incorporates a robustness metric that automatically disperses datasets across the grid, providing a resilience that ensures data is not affected by the loss of one (or possibly more) storage nodes.

[Figure: UKQCD Grid Deployment Architecture.]
DiGS allows the user to query an associated Metadata Catalogue using a GUI browser, locating and automatically retrieving datasets based on a query definition. The software also provides a Job Submission System that allows a user to schedule computations on remote HPC systems from their own PC. Security builds on the Grid Security Infrastructure (GSI), which uses X.509 digital certificates to authenticate and authorise user requests. The result is a reliable, secure data management system.

DiGS aims to ensure the reliability and integrity of the data. To this end, every piece of data and metadata is replicated on at least two nodes of the grid. Each node adds a further layer of data integrity through the use of various RAID configurations. The software also ensures consistency between the metadata catalogue and the actual data. Moreover, the software tools have been designed with the user in mind, so that there are no barriers to the rapid acceptance of the new system: to the user, the interface is the grid.
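The replication and consistency guarantees described above can be illustrated with a small sketch. This is not the DiGS implementation: the data layout (a dictionary mapping file names to the set of nodes holding a copy, plus a list of names known to the metadata catalogue), the function names, and the example file names are all assumptions made for illustration.

```python
MIN_REPLICAS = 2  # the text states at least two copies of every item


def under_replicated(replicas, min_replicas=MIN_REPLICAS):
    """Return the files stored on fewer than min_replicas distinct nodes."""
    return sorted(name for name, nodes in replicas.items()
                  if len(set(nodes)) < min_replicas)


def catalogue_mismatches(catalogue_entries, replicas):
    """Compare the metadata catalogue with what is actually stored.

    Returns (missing_data, missing_metadata):
      missing_data     - catalogued files with no stored copy
      missing_metadata - stored files with no catalogue entry
    """
    stored = {name for name, nodes in replicas.items() if nodes}
    catalogued = set(catalogue_entries)
    return sorted(catalogued - stored), sorted(stored - catalogued)


# Hypothetical example data: file names are invented, node names are
# borrowed from the site list purely for readability.
replicas = {
    "config_0001": {"edinburgh", "liverpool"},
    "config_0002": {"swansea"},
    "config_0003": {"glasgow", "ral"},
}
catalogue = ["config_0001", "config_0002", "config_0004"]

print(under_replicated(replicas))                 # ['config_0002']
print(catalogue_mismatches(catalogue, replicas))  # (['config_0004'], ['config_0003'])
```

In this sketch a file with a single copy is flagged for re-replication, and any disagreement between the catalogue and the stored files is reported in both directions, which is one plausible way to read the consistency requirement.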

UKQCD have prepared a Macromedia Flash presentation that illustrates the usage of DiGS software.

[Figure: screenshot of the ILDG Browser Client in use, showing the metadata catalogue.]
The UKQCD Grid has been operational since autumn 2002. As of February 2009, the storage capacity is around 80 Terabytes. The service hosts over 60,000 simulation datasets, which occupy approximately two thirds of the available capacity.

UKQCD have prepared an XML Application to define the format of the metadata documents in an extensible and scientifically meaningful manner. This metadata allows users of the grid to search for data matching a set of parameters. This has the potential to avoid duplication of effort.
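As a rough illustration of parameter-based searching over XML metadata, the sketch below parses a few small documents and returns the datasets whose fields match the requested values. The element names (`dataset`, `name`, `beta`, `lattice`), the sample values, and the `search` function are hypothetical; they are not taken from the UKQCD XML Application.

```python
import xml.etree.ElementTree as ET

# Invented metadata documents; element names and values are illustrative only.
DOCUMENTS = [
    "<dataset><name>ensemble_a</name><beta>5.20</beta>"
    "<lattice>16x16x16x32</lattice></dataset>",
    "<dataset><name>ensemble_b</name><beta>5.29</beta>"
    "<lattice>16x16x16x32</lattice></dataset>",
    "<dataset><name>ensemble_c</name><beta>5.20</beta>"
    "<lattice>24x24x24x48</lattice></dataset>",
]


def parse_metadata(xml_text):
    """Flatten one metadata document into a {tag: text} dictionary."""
    root = ET.fromstring(xml_text)
    return {child.tag: (child.text or "").strip() for child in root}


def search(documents, **criteria):
    """Return the names of datasets whose metadata matches every criterion."""
    matches = []
    for text in documents:
        record = parse_metadata(text)
        if all(record.get(key) == value for key, value in criteria.items()):
            matches.append(record.get("name"))
    return matches


print(search(DOCUMENTS, beta="5.20"))                         # ['ensemble_a', 'ensemble_c']
print(search(DOCUMENTS, beta="5.20", lattice="24x24x24x48"))  # ['ensemble_c']
```

A researcher planning a new simulation could run such a query first and find that a dataset with the required parameters already exists, which is how metadata search helps avoid duplicated effort.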

The UKQCD Grid is integrated with similar activities in the International Lattice Data Grid (ILDG), allowing like-minded scientists around the world to share their data and benefit from the scientific progress of other groups. The multi-national data grid is being built on web service technologies.

Outreach and knowledge transfer

More information about outreach and knowledge transfer activities is provided on the DiGS homepage.

Further information and resources

Click here for access to the project Log Book plus the current set of project deliverables, publications, and other presentation material.

Contact

If you would like to learn more about the QCDGrid project, or have any comments or questions, please contact qcdgrid-enquiries@epcc.ed.ac.uk.


Last modified Wed 1 December 2010
For more about GridPP please contact Neasan O'Neill