Basic Certificate and Proxy Concepts
Private and Public Keys
Grid security is based on the concept of public key encryption. Each user (or other entity, such as a server) has a private key, generated randomly. This is a number which can be used as a secret password to prove your identity. The private key must therefore be kept completely secure; if someone can steal it, they can impersonate you completely.
Each private key is mathematically related to another number called the public key. As the name suggests, this can be known to everyone. In principle it is possible to calculate the private key from the public key, but in practice such a calculation is expected to take an infeasibly long time (the time required grows extremely rapidly with the size of the keys).
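As a rough illustration of how the two keys are related, the following Python sketch builds a toy key pair in the style of RSA. This is an assumption: the text above does not name a specific algorithm. The primes are deliberately tiny, so recovering the private key from the public key by brute force finishes instantly here, whereas at real key sizes it would not. The function name recover_private_exponent is made up for this example.

```python
import math

# Toy RSA-style key pair. The primes are deliberately tiny; real keys use
# primes that are hundreds of digits long.
p, q = 61, 53
n = p * q                      # modulus, shared by both keys
phi = (p - 1) * (q - 1)
e = 17                         # public exponent
d = pow(e, -1, phi)            # private exponent (needs Python 3.8+)

public_key = (n, e)            # can be known to everyone
private_key = (n, d)           # must be kept secret


def recover_private_exponent(n: int, e: int) -> int:
    """Recover d from the public key by factoring n with trial division.

    Feasible only because n is tiny; the work grows very rapidly with key size.
    """
    for candidate in range(2, math.isqrt(n) + 1):
        if n % candidate == 0:
            other = n // candidate
            return pow(e, -1, (candidate - 1) * (other - 1))
    raise ValueError("n has no non-trivial factor")


# An attacker who can factor n learns the private exponent.
print(recover_private_exponent(*public_key) == d)
```

Trial division is used only because it is the simplest possible attack; it says nothing about the best known methods against real keys.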
Encryption
The keys are used in an encryption algorithm, i.e. a mathematical function which can be applied to any data. The algorithm has the property that data encrypted using the private key can be decrypted with the public key, and vice versa.
Among other things, this can be used to prove identity. Imagine that Ada knows Ben's public key. Ada chooses a random piece of data, encrypts it with Ben's public key and sends it to Ben. Ben decrypts it with his private key and sends the result back to Ada. If it matches the data Ada originally chose, this proves that Ben does indeed hold the matching private key.
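The exchange between Ada and Ben can be sketched in a few lines of Python. This is a minimal illustration assuming a toy RSA-style key pair (the numbers N, E and D below come from the tiny primes 61 and 53 and are far too small for real use); the function names are invented for the example.

```python
import secrets

# Ben's toy key pair (assumed values, built from the primes 61 and 53).
N = 3233          # modulus, part of both keys
E = 17            # public exponent: known to Ada
D = 2753          # private exponent: known only to Ben


def encrypt_with_public(value: int) -> int:
    return pow(value, E, N)


def decrypt_with_private(value: int) -> int:
    return pow(value, D, N)


# Ada: pick a random challenge and encrypt it with Ben's public key.
challenge = secrets.randbelow(N - 2) + 2      # random number in [2, N - 1]
ciphertext = encrypt_with_public(challenge)

# Ben: decrypt with the private key and send the result back.
response = decrypt_with_private(ciphertext)

# Ada: the response matches only if Ben holds the right private key.
ben_is_authentic = (response == challenge)
print(ben_is_authentic)
```

Real protocols add padding and other safeguards on top of this basic idea; the sketch shows only the round trip described above.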
Signing
Private keys can also be used to sign a piece of data. This involves another mathematical function called a hash function. This is something which can be applied to data of any length, and produces a fixed-length number which is characteristic of the input data, like a digital fingerprint - in particular even a tiny change to the input would produce a completely different hash. It should also be difficult (i.e. take a very large amount of computer power) to find any data at all which would produce a given hash.
To sign a piece of data you calculate a hash from it, and then encrypt the hash with your private key and attach the result to the data. Anyone else can then decrypt the hash with your public key, and compare it with one they calculate themselves. If the two hashes match they know two things: that the data was signed by someone who had the private key corresponding to that public key, and that the data has not been modified since it was signed.
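The hash-then-encrypt procedure can be sketched as follows. This is an illustration only: it assumes a toy RSA-style key built from two well-known Mersenne primes and uses SHA-224 from Python's hashlib as the hash function, with no padding. Neither choice would be acceptable in a real system, and the names sign, verify and digest are invented for the example.

```python
import hashlib

# Toy key pair (an assumption for illustration): two known Mersenne primes.
P, Q = 2**107 - 1, 2**127 - 1
N = P * Q                                  # modulus, part of both keys
E = 65537                                  # public exponent
D = pow(E, -1, (P - 1) * (Q - 1))          # private exponent (Python 3.8+)


def digest(data: bytes) -> int:
    """Fixed-length fingerprint of the data, as an integer smaller than N."""
    return int.from_bytes(hashlib.sha224(data).digest(), "big")


def sign(data: bytes) -> int:
    """Encrypt the hash with the private key."""
    return pow(digest(data), D, N)


def verify(data: bytes, signature: int) -> bool:
    """Decrypt the signature with the public key and compare the hashes."""
    return pow(signature, E, N) == digest(data)


message = b"transfer file A to site B"
signature = sign(message)

print(verify(message, signature))              # data unchanged
print(verify(message + b"!", signature))       # data modified after signing
```

The first check succeeds and the second fails, which reflects the two guarantees described above: the signer held the private key, and the data has not changed since it was signed.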
Certificates
To be useful, the public key has to be connected to some information about who the user (or server) is. This is stored in a specific format known as an X.509 certificate (X.509 being the name of the standard which specifies the format).
The most important thing in the certificate is the Subject Name (SN), which looks something like /C=UK/O=eScience/OU=CLRC/L=RAL/CN=john smith. This is an example of a more general format called a Distinguished Name (DN), which appears quite a lot in the Grid world. The idea is that a DN should uniquely identify the thing it names. Unfortunately the details of how to construct a DN have never been established as an international standard, but at least within the Grid you can assume that a DN is a unique name, and the SN in your certificate is your name as far as the Grid is concerned.
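To make the structure of a DN concrete, here is a small Python sketch that splits the slash-separated form shown above into its components. The function name parse_dn is made up for this example, and the sketch assumes that no component value itself contains a "/" character, which real tools cannot always assume.

```python
def parse_dn(dn: str) -> list[tuple[str, str]]:
    """Split a slash-separated DN into (attribute, value) pairs, in order."""
    components = []
    for part in dn.split("/"):
        if not part:                 # skip the empty piece before the first "/"
            continue
        attribute, value = part.split("=", 1)
        components.append((attribute, value))
    return components


subject = "/C=UK/O=eScience/OU=CLRC/L=RAL/CN=john smith"
components = parse_dn(subject)

# The last CN component is the part that names the individual.
common_name = [value for attribute, value in components if attribute == "CN"][-1]
print(components)
print(common_name)
```

The order of the components matters, so the result is kept as a list rather than a dictionary; a DN may also contain the same attribute more than once.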
A certificate also contains some other information, in particular an expiry date after which the certificate is no longer valid. User certificates are normally issued with an expiry date one year ahead, and you have to renew them before they expire. A renewed certificate will normally have new public and private keys, but you will usually keep the same SN. In some circumstances, e.g. if your private key is stolen, a certificate may be revoked, i.e. added to a known list of certificates which should be considered invalid.
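The two checks described above, expiry and revocation, can be sketched as a single function. The CertInfo class, the serial number and the dates are hypothetical values chosen for the example; the sketch assumes that the revocation list identifies certificates by serial number.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass
class CertInfo:
    subject: str
    serial: int              # hypothetical serial number assigned by the CA
    not_after: datetime      # expiry date


def is_usable(cert: CertInfo, now: datetime, revoked_serials: set[int]) -> bool:
    """A certificate is usable only if it is neither expired nor revoked."""
    if now >= cert.not_after:
        return False
    if cert.serial in revoked_serials:
        return False
    return True


cert = CertInfo(
    subject="/C=UK/O=eScience/OU=CLRC/L=RAL/CN=john smith",
    serial=4711,
    not_after=datetime(2008, 2, 1),
)

print(is_usable(cert, datetime(2007, 6, 1), revoked_serials=set()))
print(is_usable(cert, datetime(2007, 6, 1), revoked_serials={4711}))
print(is_usable(cert, datetime(2008, 6, 1), revoked_serials=set()))
```

A service has to refresh its copy of the revocation list regularly for the second check to mean anything.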
Certification Authorities
Certificates are issued by a Certification Authority (CA). There are commercial CAs, of which the best-known is Verisign, but for Grid use the UK runs its own CA based at the Rutherford Appleton Laboratory. The CA follows some defined procedures to make sure that it knows who you are and that you are entitled to have a certificate.
To allow people to verify the information in the certificate, the CA signs it with its own private key. Anyone who wants to check the validity of a certificate therefore needs to know the public key of the CA, so the CA has a certificate of its own. Potentially this could create an infinite regress, but this is prevented by the fact that CA certificates (known as root certificates) are self-signed, i.e. the CA signs its own certificate. These root certificates are then distributed in some secure way, which in the Grid is typically as Linux RPMs from a trusted repository. (The root certificates of many commercial CAs are pre-installed in Windows.)
Proxies
If you interact directly with a remote service you can use your certificate to prove your identity. However, in the Grid world you often want a remote service to act on your behalf, e.g. a job running on some remote site needs to be able to talk to other servers to transfer files, and it therefore needs to prove that it's entitled to use your identity. On the other hand, since your private key is so vital you don't want to send it to remote machines which might be insecure.
The solution is the use of something called a proxy. Strictly speaking a proxy is also a certificate, but usually the unqualified term "certificate" is reserved for something issued by a CA. To create a proxy you create a new, temporary public/private key pair, build a new certificate containing the public key with an SN like /C=UK/O=eScience/OU=CLRC/L=RAL/CN=john smith/CN=proxy, and sign it with your long-term private key. Proxies normally have a rather short lifetime, typically 12 hours. Note that this is a purely local process: there is no contact with any remote service.
When you submit a job you send the proxy certificate, the private key for the proxy and your normal certificate. When the job wants to prove its identity to another service it sends it the proxy certificate and the standard certificate, but (usually) not the proxy private key. It can then use the chain of certificates to prove that it is entitled to use your SN. In some circumstances a job may even create a new proxy itself, so the chain can potentially be longer.
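The naming side of the chain check can be sketched in Python as follows. This is a simplified illustration, not how any particular Grid middleware implements it: it checks only that each proxy was issued by the certificate before it and that its SN is the issuer's SN with one extra CN component appended. Signature verification and expiry checks, which a real service must also perform, are left out. The Cert class, the function name and the CA name are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Cert:
    subject: str     # SN of this certificate
    issuer: str      # SN of whoever signed it


def identity_from_chain(chain: list[Cert]) -> str:
    """Return the user's SN if the chain's naming is consistent.

    chain[0] is the user certificate issued by the CA; every later entry is
    a proxy signed by the entry before it. Raises ValueError otherwise.
    """
    user_cert = chain[0]
    previous = user_cert
    for proxy in chain[1:]:
        if proxy.issuer != previous.subject:
            raise ValueError("proxy was not issued by the previous certificate")
        extra = proxy.subject[len(previous.subject):]
        if not (proxy.subject.startswith(previous.subject)
                and extra.startswith("/CN=")
                and "/" not in extra[1:]):
            raise ValueError("proxy SN does not extend the issuer's SN by one CN")
        previous = proxy
    return user_cert.subject


user_sn = "/C=UK/O=eScience/OU=CLRC/L=RAL/CN=john smith"
user = Cert(subject=user_sn, issuer="/C=UK/O=eScience/CN=Example CA")  # hypothetical CA name
proxy = Cert(subject=user_sn + "/CN=proxy", issuer=user_sn)
second_proxy = Cert(subject=proxy.subject + "/CN=proxy", issuer=proxy.subject)

print(identity_from_chain([user, proxy, second_proxy]))
```

The three-element chain corresponds to the case mentioned above where a job creates a further proxy of its own.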
In security terms a proxy is a compromise. Since the proxy's private key is sent along with it, anyone who steals the proxy can impersonate you, so proxies still need to be treated carefully. There is also no mechanism for revoking proxies, so in general, even if you know that one has been stolen, there is little you can do to stop it being used. On the other hand, proxies usually have a lifetime of only a matter of hours, so the potential damage is fairly limited.
VOMS Proxies
A new system called VOMS (Virtual Organisation Membership Service) is being introduced to manage information about the roles and privileges of users within a Virtual Organisation (VO). This information is presented to services via an extension to your proxy. At the time you create the proxy you contact one or more VOMS servers, and they return a mini certificate known as an Attribute Certificate (AC) which is signed by the VO and contains information about your group membership and any roles you may have in the VO.
When you create your proxy you embed the ACs in it and sign the whole thing with your private key. Services can then decode the VOMS information and use it as required, e.g. you may only be allowed to do something if you have a particular role from a specific VO.
One thing to be aware of is that each AC has its own lifetime. This is typically 12 hours as for the proxy, but it is possible for ACs to expire at different times to each other and to the proxy as a whole.
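The practical consequence is that the attributes from a given VO stop being usable at whichever comes first, the expiry of that AC or the expiry of the proxy itself. A small Python sketch, using made-up VO names and times, shows the calculation.

```python
from datetime import datetime, timedelta

created = datetime(2007, 2, 1, 9, 0)               # hypothetical creation time
proxy_expiry = created + timedelta(hours=12)

# Hypothetical ACs from two VOs, with different lifetimes.
ac_expiry = {
    "vo-a.example.org": created + timedelta(hours=12),
    "vo-b.example.org": created + timedelta(hours=6),
}


def usable_vos(now: datetime) -> list[str]:
    """VOs whose attributes can still be used at the given time."""
    return sorted(
        vo for vo, expiry in ac_expiry.items()
        if now < min(expiry, proxy_expiry)
    )


print(usable_vos(created + timedelta(hours=1)))    # both ACs still valid
print(usable_vos(created + timedelta(hours=8)))    # the shorter AC has expired
print(usable_vos(created + timedelta(hours=13)))   # the proxy itself has expired
```

In the second case the proxy is still valid and still proves your identity, but it no longer carries usable attributes from the VO whose AC has expired.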
Last modified Thu 1 February 2007