Introducing The Grid and GridPP
From Web to Grid - the Next IT Revolution
GridPP is a collaboration of Particle Physicists and Computing Scientists from 19 UK universities, Rutherford Appleton Laboratory and CERN, who are building a Grid for Particle Physics. Funded by the government, through STFC, it is the UK's contribution to the international collaboration building a worldwide Grid, the wLCG.
This introduction answers some frequently asked questions about the Grid:
What is the Grid?
Why do we need the Grid?
What makes the Grid different from the Web or Internet?
What is GridPP doing?
Who and what is involved in GridPP?
Who else is using Grids?
It will also discuss Grid middleware, running a Grid, Grids beyond particle physics and GridSite, a website tool developed by GridPP.
What Is The Grid?
"The Grid" is the next leap in computer interconnectivity.
The Internet and the World Wide Web are increasingly an integral part of people's lives, helping the world share information and transfer data quickly and easily.
In the same way as we now share files and facts over the global network of computers, in the future the Grid will let us share other things, such as processing power and storage space.
Imagine sitting down at your computer with only a screen, keyboard and mouse but still having limitless computing power. No need for a bulky tower, no need to upgrade to the latest processing chip. This is what the Grid promises to the ordinary home user. But what is the Grid? How does it work? What are its other applications? To answer these questions we need to look at what is driving the move towards Grids and at what this international virtual supercomputer will be used for.
During the late 60s the US government developed a Wide Area Network (WAN) to help scientists around America communicate more efficiently. It was a prototype Internet. 20 years on, by the late 80s, the Internet was still the preserve of academics and high-level computer users. This all changed with the introduction of the World Wide Web. Tim Berners-Lee invented the web at CERN in 1989. It was designed to make it even easier for scientists in different locations to work together. In the 15 or so years since it was first put online, the web has gone from a tool of a select few to a publicly accessible tool.
While an amazing facility, the web merely distributes information. The next step is to share all resources (computing power and data storage as well as information) on a global scale. For this we need Grid computing. The vision is that once connected to the Grid, the end user will see it essentially as one large computer system. In the future, computer services could then become a utility like electricity: an on-demand service where you pay for what you use.
There are some other very good introductory Internet resources on the Grid and GridPP.
- CERN's Grid Cafe is an excellent starting point for more information about the Grid.
- The Grid Guide is another resource highlighting some of the projects and institutes working on or using the Grid.
- For current information and news about what is going on in Grids today, check out the weekly online newsletter iSGTW.
GridPP have produced:
- A four-page flyer.
- A presentation for schools.
- A talk explaining the Grid and GridPP.
There are some general articles on Grids too:
- The US National Centre for Supercomputing Applications have a website which asks "What is the Grid?" with video clips of experts talking about the "Grid".
- "Grid Today" has an article "What is the Grid? A three-point checklist" by Ian Foster. He talks about how "Grids" are defined, and discusses the difference between real Grids and projects that just have "Grid" in the name.
Why Do We Need The Grid?
The Grid is a practical solution to the problems of storing and processing the large quantities of data that will be produced by industry and the scientific communities over the next decade. Particle physicists are waiting for a new particle accelerator to start at the world's largest particle physics laboratory, CERN. The Large Hadron Collider (LHC) will be the most powerful instrument ever built to investigate fundamental physics. Once it is fully functional, the amount of data being produced will be massive. This will be too much for any one institution to handle, so institutions need to share resources, i.e. to use distributed computing.
What Makes The Grid Different From The Web Or Internet?
The Grid is built on the same Internet infrastructure as the web, but uses different tools.
Middleware is one of these tools.
In a stand-alone computer, the resources allocated to each job are managed by the operating system (e.g. Windows, Linux, Unix, Mac OS X).
Middleware is like the operating system of a Grid, allowing users to access resources without searching for them manually.
GridPP has developed middleware for the Grid, in collaboration with other international projects. Due to GridPP's open source policy, the middleware can evolve and be improved by the people who use it.
Distributed computing has been available to scientists for some time but, in general, the use of different sites has to be negotiated by each scientist individually. They need a separate account on each system, and jobs have to be submitted and results collected back by hand. Current distributed computing means the user has a lot of work to do to get their results. This is where the idea of Grid computing comes in.
Middleware lets users simply submit jobs to the Grid without having to know where the data is or where the jobs will run. The software can run the job where the data is, or move the data to where there is CPU power available. Using the Grid and middleware, all the user has to do is submit a job and pick up the results.
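The placement decision described above can be illustrated with a small Python sketch. It is a toy model under stated assumptions, not GridPP's middleware: the `Site` class, the `choose_site` function, the site names and the dataset names are all hypothetical, and real middleware weighs many more factors.

```python
from dataclasses import dataclass, field


@dataclass
class Site:
    """A hypothetical Grid site with some free CPUs and locally stored datasets."""
    name: str
    free_cpus: int
    datasets: set = field(default_factory=set)


def choose_site(sites, dataset, cpus_needed):
    """Return (site, action) for a job needing `dataset` and `cpus_needed` CPUs.

    Prefer a site that already holds the data; otherwise pick the site with
    the most free CPUs and note that the data must be copied there first.
    """
    candidates = [s for s in sites if s.free_cpus >= cpus_needed]
    if not candidates:
        return None, "wait"  # no site has enough free CPUs right now

    with_data = [s for s in candidates if dataset in s.datasets]
    if with_data:
        # Run the job where the data already is.
        return max(with_data, key=lambda s: s.free_cpus), "run"

    # Otherwise move the data to where CPU power is available.
    return max(candidates, key=lambda s: s.free_cpus), "copy data, then run"


# Made-up sites and datasets, for illustration only.
sites = [
    Site("site-a", free_cpus=2, datasets={"run-001"}),
    Site("site-b", free_cpus=50, datasets={"run-002"}),
    Site("site-c", free_cpus=10, datasets={"run-001"}),
]

site, action = choose_site(sites, "run-001", cpus_needed=4)
print(site.name, action)
```

The user only names the dataset and the resources needed; where the job ends up is decided for them, which is the point of the middleware.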
Acting as the gatekeeper and matchmaker for the Grid, middleware monitors the Grid, decides where to send computing jobs, and manages users, data and storage.
It will check the identity of the user through the use of digital certificates. A digital certificate is a file stored securely on a user's computer which allows the Grid to correctly identify that user. The certificates are given to a user by the Certification Authority, with numerous steps to ensure the person applying is who they say they are. The middleware automatically extracts the user's identity from their digital certificate and uses this to log them in. This means users don't have to remember user names and passwords to log onto the Grid; they're automatically logged on using their Grid certificate. After this seamless identification process the middleware will find the most convenient and efficient places for the job to be run and organise efficient access to the relevant scientific data.
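As a rough illustration of logging in by certificate identity rather than by password, the Python sketch below maps a certificate's subject string to a local account. The subject string, its slash-separated layout, the account mapping and the function names are all assumptions made for this example. The sketch also skips the essential step of verifying that the certificate was really issued by a trusted Certification Authority.

```python
def parse_subject(subject):
    """Split a slash-separated subject string into a dict of its fields."""
    fields = {}
    for part in subject.strip("/").split("/"):
        key, _, value = part.partition("=")
        fields[key] = value
    return fields


def log_in(subject, known_users):
    """Return the local account mapped to this certificate subject, or None."""
    return known_users.get(subject)


# Hypothetical subject string and account mapping, for illustration only.
subject = "/C=UK/O=ExampleOrg/OU=Physics/CN=jane doe"
known_users = {subject: "jdoe"}

print(parse_subject(subject)["CN"])
print(log_in(subject, known_users))
```

No user name or password is typed at any point: the identity carried by the certificate is the only thing used to find the account.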
It deals with authentication to the different sites being used, runs the jobs, keeps track of progress, lets the user know when the work is complete and transfers the result back.
For scientists, the Grid will make it possible to treat the distributed resources as one integrated computer system with a single log-on, while the middleware handles all the negotiations, the submission of jobs and the collation of the results.
Running the Grid
Using middleware is only the start of getting a Grid working.
For a functioning Grid you also need:
- A team to manage the day-to-day running of the Grid, making sure sites are working.
- Software that lets Grid users submit their computing jobs to the Grid.
The Deployment and Operations team for GridPP makes sure that new versions of the middleware are properly installed at each site and ensures that the Grid is operating correctly at all times. This includes monitoring the Grid, to make sure sites are working properly, and even going out to individual sites to help them get started. For an Internet user, it doesn't matter what operating system, browser, processor speed or platform they're using. GridPP are trying to emulate this ease of use - although there is some way to go.
The particle physics Grid will be needed to run the highly complex calculations that will be necessary when the LHC starts. The LHC itself will accelerate particle beams at each other at enormous energies. Huge detectors will then monitor collisions between the particles. The four experiments that will be run on the LHC are ALICE (A Large Ion Collider Experiment), ATLAS (A Toroidal LHC ApparatuS), CMS (Compact Muon Solenoid) and LHCb (Large Hadron Collider beauty). The applications team coordinate these experiments' input into the development of the Grid as they are its end users for the immediate future. They will also ensure that all the data from the detectors is being transferred and stored on the Grid.
The GridPP website has a number of pages that explain the different parts of the project. These include posters used at events, demonstrations which showcase the state of the Grid in real time or simulate it in full operation, and the Grid Acronym Soup, which contains explanations of Grid abbreviations and acronyms.
What Is GridPP Doing?
The initial phase of GridPP (2001-2004) built the UK testbed, a working prototype for a Grid that was linked to other Grid testbeds around the world.
It was used to analyse real data from experiments being run at different institutions around the world.
These included BaBar at SLAC and CDF and D0 at Fermilab, both in the US, and the UKQCD collaboration.
Although not yet on the scale expected from the LHC, these experiments were and still are an important test of the tools and techniques of the Grid.
The Grid testbed was also used for trial runs with simulated LHC data, called data challenges.
A timeline of the project is available at http://www.gridpp.ac.uk/timeline/
Who And What Is Involved In GridPP?
GridPP as an organisation manages the UK's involvement in CERN's Large Hadron Collider Computing Grid project (LCG).
It oversees a Tier 1 facility at the Rutherford Appleton Laboratory (RAL) and the Tier 2 organisations of ScotGrid, NorthGrid, SouthGrid and London.
GridPP is also part of a larger, interdisciplinary project called the European Grid Initiative (EGI). Funded by the EU, EGI started in May 2010 with the aim of bringing together computing Grids from different countries and disciplines. It now covers areas from geology to computational chemistry, with 30,000 CPUs at over 200 sites in 39 countries. The GridPP Grid is part of the EGI network, providing computing power to tasks beyond the high energy physics world.
The Tier 2s and their institutions
The Tier 2 facilities are virtual: they do not exist in one location. They are institutions that work together to pool their resources.
The users:
The users are also grouped together virtually, as Virtual Organisations (VOs).
Any group working together, no matter where its members are geographically, forms a VO. For example, there are VOs for each of the LHC experiments.
To use the Grid a user must have a digital security certificate: this is what the middleware looks for when verifying whether you are allowed to use the Grid.
Certificates are distributed by the Certification Authority, with the person in charge of a VO authorising users (or not).
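The two checks described here, a certificate from the Certification Authority and authorisation by the person in charge of a VO, can be sketched in Python as follows. This is a simplified model under assumptions: the `Certificate` class, the issuer name, the membership lists and `may_use_grid` are hypothetical, and real middleware checks certificates cryptographically rather than by comparing strings.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Certificate:
    """A stand-in for a digital certificate (hypothetical fields)."""
    subject: str  # who the certificate identifies
    issuer: str   # which Certification Authority issued it


# Made-up trusted issuer, for illustration only.
TRUSTED_ISSUERS = {"Example Certification Authority"}

# VO membership lists, as maintained by whoever is in charge of each VO
# (made-up names and data).
VO_MEMBERS = {
    "atlas": {"jane doe"},
    "lhcb": {"john smith"},
}


def may_use_grid(cert, vo):
    """Allow access only if a trusted CA issued the certificate
    and the user has been authorised as a member of the VO."""
    if cert.issuer not in TRUSTED_ISSUERS:
        return False
    return cert.subject in VO_MEMBERS.get(vo, set())


jane = Certificate(subject="jane doe", issuer="Example Certification Authority")
print(may_use_grid(jane, "atlas"))
print(may_use_grid(jane, "lhcb"))
```

Keeping the two checks separate mirrors the division of responsibility in the text: the Certification Authority vouches for who a person is, while the VO decides what that person may do.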
Who else is using Grids?
GridPP is certainly not the only organisation working on Grids. There are other organisations which GridPP works with or is a member of. They include:
- CERN's Large Hadron Collider Computing Grid project (wLCG), which is developing a worldwide computational Grid to deal with the computing demands of the LHC.
- The European Grid Initiative, the European Union's main Grid project. It brings together experts from 38 countries with the common aim of developing a Grid infrastructure for international science.
- Globus, a US-European collaboration conducting research and development to create the fundamental technologies behind the Grid.
- The Open Grid Forum, a community-initiated forum of thousands of individuals from industry and research developing global standards for Grid computing.
- The UK e-science core programme, which gives details of the UK's Grid and e-science programme. It also includes links to the e-science programmes of all the UK Research Councils.
- STFC's e-Science programme, which funds projects such as GridPP and Astrogrid.
Beyond Particle Physics
Within the UK, GridPP is also collaborating with other parts of the UK's e-science programme, such as the National Grid Service. Many of the tools developed by GridPP could be useful for other disciplines. For example, GridPP has been working with clinical researchers on the potential for using its computer security tools in the health service.
In addition, GridPP is open to opportunities to work with industry and to discuss its experience of current Grid development issues and the solutions adopted. More on this work can be found on the Knowledge Exchange and Economic Impact pages.
It is not even just science that uses Grids. For example, Montclair State University's humanities department uses a Grid called the Inferno Grid.
They are using it for exactly the same reasons as scientists: collaboration is easier and they maximise resource use.
Current texts available on their system include Aristotle, Galen, Plato, Marcus Aurelius, and Commentaries.
GridSite
GridPP has also developed the open source GridSite tool.
This identifies users to websites, using the same digital certificates as the Grid.
Rather than needing to remember lots of passwords when you visit a website or webpage, GridSite will identify you.
You will then have access to the site's web-based editing interface. GridSiteWiki is an extension to the tool that does the same job for Wikis.
Since GridSite is open source, just like the middleware, it is available for any website to use.
Last modified Wed 18 August 2010.







