Difference between revisions of "New VO deployment"

From GridPP Wiki
m (started fixing Tom's b0rken link)
(12 intermediate revisions by 3 users not shown)
Line 1: Line 1:
 +
This page covers both the ''creation'' and ''deployment'' of new VOs. Its target audience is the technical and support contacts of the new VO, and GridPP infrastructure sysadmins.
 +
 
== Creating a New VO ==
 
== Creating a New VO ==
  
The general procedure is sketched out in the section [[Instruction for VO administrators]]. The process is still under development, and anyone wishing to create a new VO should contact the [https://www.gridpp.ac.uk/deployment/contact.html Deployment Team] for further help and information (in particular the Production Manager, Security Officer and VOMS Manager).
+
A ''Virtual Organisation'' or VO is a group or collaboration with a common purpose. It may be a particular project with funding, it may be a common purpose, or it may be a specific group of people collaborating on a particular task. Typically, a VO will be able to share data with every member in the VO, and they will have resources allocated to them as a VO. VOs typically have a lifetime of years, not days or months.
  
It is also possible to join an existing VO, of course, if there is one with similar goals: see the [http://operations-portal.egi.eu/vo/registrationWelcome EGI VO registration portal], and in particular the [http://operations-portal.egi.eu/vo/search list of existing VOs] (click the + on the left to see the VO information - a new VO would need to provide the same information.)
+
It is also possible to join an existing VO, of course, if there is one with similar goals: see the [http://operations-portal.egi.eu/vo/search EGI VO registration portal].
  
What follows below is general information that you should be aware of if you want to start a VO.  
+
VOs can be local (supported at a single site), national (supported by several sites but in the same country, e.g. on a national infrastructure), and international. What follows below is generally about international VOs; for regional VOs (local/national), some of the information requested below may not be needed.
  
 
===Information needed===
 
===Information needed===
The VO will need to provide some information, partly for security reasons and partly to let system administrators judge what resources the VO will be likely to need. Useful information would include:
+
The VO will need to provide some information, partly for practical and security reasons and partly to help system administrators estimate what resources the VO will be likely to need.  
* VO name. This should be reasonably short, distinctive, and must not clash with any existing VO. A lower-case name is recommended, and generally no more than five or six characters (letters and numbers are allowed in the name, but most other characters are not). There is a recommendation to base VO names on DNS names to avoid name clashes, so that  GridPP VOs should have names like xxx.gridpp.ac.uk.
+
 
* VO support contacts - both specific responsible people and various experiment mailing lists.
+
* Name of the VO. This should be reasonably short, distinctive, and must not clash with any existing VO. A VO will typically have two names, a short name (usually lower case), say "poohsticks" (an experiment running poohsticks simulations), and a DNS style name, such as vo.poohsticks.org (assuming they own the DNS domain poohsticks.org.)
* Security contacts - ideally at least two people who can respond quickly in the event of a security incident relating to a member of the VO, or to the VO as a whole.
+
* VO management: a VO manager (who can decide membership of VO, roles and responsibilities), plus usually at least one deputy or co-manager. You will need an email address for the management contact(s), and it is recommended to use a mailing list so people can be added and messages are archived (e.g. poohsticks-management@example.com)
* VO/VOMS server, file catalogue etc. end-points (see below).
+
* VO support contacts (see below) - you can choose to register support people with the helpdesks; this is recommended for larger VOs or for VOs that have VO-specific software on the grid.
* Hardware requirements - memory size, disk space etc.
+
* Security contacts - ideally at least two people who can respond quickly in the event of a security incident relating to a member of the VO, or to the VO as a whole. This also needs to have a mailing list.
 +
* A VOMS server. The VOMS server manages the VO's memberships and the members' roles and subgroups. If the VO is not already hosted on one of the existing VOMS servers, it is recommended to request use of GridPP's - for this, you will typically need ''authorisation'' from the principal investigator (PI).
 +
** VO members including the managers will usually need personal X.509 certificates from an [https://www.igtf.net/ IGTF CA].
 +
*** However, one of these is [https://rcauth.eu/ RCauth] which can generate (lower assurance) certificates based on federated identities.
 +
*** Increasingly many research infrastructures provide (web) authentication through portals with federated identity management, meaning users can use federated login to authenticate to the infrastructure.  However, in VOMS, roles etc are usually assigned to a distinguished name, meaning privileged users without certificates may need credential conversion (ie. an online CA) to create certificates for them.
 +
** Optionally, roles and/or subgroups of the VO can be defined - see Security Considerations below. These are set up by the VO manager once they are authorised on the VOMS server.
 +
* Hardware requirements - memory size, disk space etc, types of storage required (working repositories, archives, databases)
 
* Software requirements - any software beyond the basic Linux tools/libraries, including things which are part of standard distributions as they may not be installed by default.
 
* Software requirements - any software beyond the basic Linux tools/libraries, including things which are part of standard distributions as they may not be installed by default.
 
* Typical usage pattern - expected job frequency and variation over time, job length, data read and written per job etc.
 
* Typical usage pattern - expected job frequency and variation over time, job length, data read and written per job etc.
* Glue schema fields used - this would give an idea of what is really used in the information system and needs to be ensured to be properly set and maintained.
+
* General procedures - for example if the site has to request the installation of VO-specific software.
* General procedures - for example if the site has to request the installation of VO software.
+
* Expected size of the VO (i.e how many users).
* Size of the VO (i.e how many users), to give a guide to how many pool accounts to create.  
+
  
See the [http://www.phenogrid.dur.ac.uk/howto/config Phenogrid] web site for an example of the sort of thing required. You can also have a look at a [https://www.gridpp.ac.uk/deployment/users/questionnaire.html questionnaire] which EGEE has used to start discussions with new VOs.
+
===Security considerations===
 +
* The VO managers are responsible for setting the aims of the VO: describe which work is in scope and which is not. Users must do only work which is in scope for the VO.
 +
** VOs should define an acceptable use policy (AUP) for its members.
 +
* The VO security contacts (via the mailing list required for registering the VO with EGI) are expected to deal in a timely manner with breaches of policy. An incident may lead to the errant member of the VO being banned from the infrastructure, but if the GridPP admins/coordinators don't feel the incident is being dealt with satisfactorily by the VO, the whole VO may get banned (perhaps temporarily). During a ban, no jobs can be run and no data can be accessed at any site.
 +
* If the VO has data security requirements, these should be discussed with GridPP during setup of the VO.
 +
** A typical VO using GridPP will have every file readable by everyone in the VO, but define a ''role'' of users who are allowed to create, update, and delete files.  However, more careful access control policies are possible.
  
The EGEE operations group has developed a standardised [http://operations-portal.egi.eu/aboutportal/map VO ID card] to provide this kind of information. Most of the entries are well explained.
+
===VO software===
 +
Each VO will typically need some VO-specific services. Some of these are grid-based, such as file catalogues, resource brokers, compute and storage elements. There may be additional software required by the VO, and there will be means of making this available.
  
===Security considerations===
+
There are various models for dealing with the installation of VO-specific software. If only a few dedicated sites are involved the software can be directly installed by the administrators. If the software is relatively compact it can be shipped with the job in the sandbox, or downloaded from a Storage Element or a web site.
The VO will need to provide administrators who take responsibility for adding users into the VO, checking that they understand their responsibilities, and if necessary removing them from the VO if they abuse the system. VOs should define what constitutes acceptable use for their members (in addition to the general acceptable use policies applicable to all grid users).
+
  
Some of the [https://www.gridpp.ac.uk/deployment/security/policies/index.html security policy documents] are relevant to VO creation and operation, and the VO administrators need to ensure that they comply with the relevant policies.
+
[[RAL Tier1]] uses [[RAL Tier1|CVMFS]] to manage software for different VOs, particularly when they conflict (e.g. requiring different versions of python or libraries, etc.) Alternatively ask the admins on the sites that support your VO what they provide.
  
===VO services===
+
===Support procedures===
Each VO will need some VO-specific services. At a minimum you need a VO/VOMS server to store the list of VO users, but file catalogues, resource brokers and perhaps other services may also be needed. These may be run by the VO itself or, by negotiation, as part of the general GridPP infrastructure. In particular a [https://voms.gridpp.ac.uk:8443/vomses/ GridPP VOMS server] is run by Manchester for the use of the GridPP community; [https://www.gridpp.ac.uk/deployment/contact.html contact] the VOMS manager for more information.
+
* VOs should be prepared to support their users at least in the use of VO-specific software.
 +
* VO support liaison should sign up for a mailing list called [http://www.jiscmail.ac.uk/lists/GRIDPP-SUPPORT.html GRIDPP-SUPPORT].
 +
* It is strongly recommended to make GridPP technical support contacts members of the VO, at least initially, in order to help shoot any trouble that might arise.
  
===Getting the VO enabled at sites===
+
The standard support route for all Grid users is the GGUS portal, as described [http://www.gridpp.ac.uk/deployment/users/support.html here]. For regional (e.g. GridPP-specific) VOs the tickets will generally be directed back to the UK Grid helpdesk. There is also a [http://www.jiscmail.ac.uk/lists/GRIDPP-USERS.html GridPP Users] mailing list for support/usage discussions.
Enabling a VO is a relatively easy process, and sites which are directly associated with the VO (including sites in other countries) should be able to do it given the information described above. To get further resources from other GridPP sites, contact the [https://www.gridpp.ac.uk/deployment/contact.html Deployment Team].
+
  
===VO software installation===
+
The [http://egi.eu EGI] [[Virtual Organisation|VO]] Registration Form can be found [https://operations-portal.egi.eu/vo/ here]. It also provides a list of documents you should consult before [[Start Here - Creating a new VO|creating a new VO]].
There are various models for dealing with the installation of VO-specific software. If only a few dedicated sites are involved the software can be directly installed by the administrators. If the software is relatively compact it can be shipped with the job in the sandbox, or downloaded from a Storage Element or a web site.
+
  
There is also a more general [http://grid-deployment.web.cern.ch/grid-deployment/eis/docs/ExpSwInstall/sw-install.html method to install software] in VO-specific disk space visible from the Worker Nodes.
+
The general procedure is sketched out in the section [[Instruction for VO administrators]]. The process is still under development, and anyone wishing to create a new VO should contact the [https://www.gridpp.ac.uk/contact/ GridPP] for further help and information.
  
===Support procedures===
 
To some extent these are still in development. VOs should be prepared to support their users at least in the use of VO-specific software. More general Grid support will be provided by GridPP as a whole, including community support by users themselves.
 
  
The standard support route for all Grid users is the GGUS portal, as described [http://www.gridpp.ac.uk/deployment/users/support.html here]. For regional (e.g. GridPP-specific) VOs the tickets will generally be directed back to the UK Grid helpdesk. There is also a [mailto:GRIDPP-USERS@JISCMAIL.AC.UK GridPP Users] mailing list (see the [http://www.jiscmail.ac.uk/lists/GRIDPP-USERS.html JISCmail] web site for subscription information).
+
== VO Deployment ==
 +
 
 +
It is recommended that a VO initially keep all its data at a single site, even if jobs are running at several sites.
 +
 
 +
It is recommended to ask for, and use, a [[Storage/SpaceTokens|"space token"]] to read and write data (at all sites that host the VO's data).
 +
 
 +
Once a VO is set up, sites should be requested to support it - VOs typically do this via their designated GridPP support contact(s). At this point the VO should have:
 +
* Acceptable use policies defined
 +
* A VOMS server with [[Instruction for VO administrators|VO information]]
 +
* Mailing lists
 +
* Resources allocated to it - CPU hours, disk storage, tape storage
 +
 
 +
Sites will then need to:
 +
* Add the VO to the list of supported VOs (e.g. gridmap files)
 +
* Add the VO to the UIs, if applicable (so users can do voms-proxy-init locally)
 +
* Allocate the resources to the VO, e.g. space tokens, disk pools
 +
 
 +
It is recommended that the VO attends the following meetings:
 +
* Tier 1 liaison meeting (if using Tier 1) - VOs should attend regularly
 +
 
 +
Once they are using GridPP resources, VOs are also welcome to attend
 +
* [https://indico.cern.ch/category/5136/ GridPP collaboration meetings]
 +
* [[Grid Storage]] meetings
 +
 
 
[[Category:VOMS]]
 
[[Category:VOMS]]
  
{{KeyDocs|responsible=Jens Jensen|reviewdate=2014-10-16|accuratedate=2014-10-16|percentage=80}}
+
{{KeyDocs|responsible=Jens Jensen|reviewdate=2018-05-17|accuratedate=2017-05-17|percentage=90}}

Revision as of 17:06, 17 May 2018

This page covers both the creation and deployment of new VOs. Its target audience is the technical and support contacts of the new VO, and GridPP infrastructure sysadmins.

Creating a New VO

A Virtual Organisation or VO is a group or collaboration with a common purpose. It may be a particular funded project, or it may be a specific group of people collaborating on a particular task. Typically, members of a VO will be able to share data with every other member of the VO, and resources will be allocated to them as a VO. VOs typically have a lifetime of years, not days or months.

It is also possible to join an existing VO, of course, if there is one with similar goals: see the EGI VO registration portal.

VOs can be local (supported at a single site), national (supported by several sites in the same country, e.g. on a national infrastructure), or international. What follows below is generally about international VOs; for regional VOs (local/national), some of the information requested below may not be needed.

Information needed

The VO will need to provide some information, partly for practical and security reasons and partly to help system administrators estimate what resources the VO will be likely to need.

  • Name of the VO. This should be reasonably short and distinctive, and must not clash with any existing VO. A VO will typically have two names: a short name (usually lower case), say "poohsticks" (an experiment running poohsticks simulations), and a DNS style name, such as vo.poohsticks.org (assuming they own the DNS domain poohsticks.org).
  • VO management: a VO manager (who can decide membership of the VO, roles and responsibilities), plus usually at least one deputy or co-manager. You will need an email address for the management contact(s), and it is recommended to use a mailing list so people can be added and messages are archived (e.g. poohsticks-management@example.com).
  • VO support contacts (see below) - you can choose to register support people with the helpdesks; this is recommended for larger VOs or for VOs that have VO-specific software on the grid.
  • Security contacts - ideally at least two people who can respond quickly in the event of a security incident relating to a member of the VO, or to the VO as a whole. This also needs to have a mailing list.
  • A VOMS server. The VOMS server manages the VO's memberships and the members' roles and subgroups. If the VO is not already hosted on one of the existing VOMS servers, it is recommended to request use of GridPP's - for this, you will typically need authorisation from the principal investigator (PI).
    • VO members including the managers will usually need personal X.509 certificates from an IGTF CA.
      • One of these, however, is RCauth, which can generate (lower assurance) certificates based on federated identities.
      • Increasingly many research infrastructures provide (web) authentication through portals with federated identity management, meaning users can use federated login to authenticate to the infrastructure. However, in VOMS, roles etc. are usually assigned to a distinguished name, meaning privileged users without certificates may need credential conversion (i.e. an online CA) to create certificates for them.
    • Optionally, roles and/or subgroups of the VO can be defined - see Security Considerations below. These are set up by the VO manager once they are authorised on the VOMS server.
  • Hardware requirements - memory size, disk space etc, types of storage required (working repositories, archives, databases)
  • Software requirements - any software beyond the basic Linux tools/libraries, including things which are part of standard distributions as they may not be installed by default.
  • Typical usage pattern - expected job frequency and variation over time, job length, data read and written per job etc.
  • General procedures - for example if the site has to request the installation of VO-specific software.
  • Expected size of the VO (i.e. how many users).
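
The items above can be collected into a simple checklist before contacting GridPP. The following is only a minimal sketch in Python: the class name, field names and the individual checks are illustrative assumptions, not part of any GridPP or EGI tool, and they cover only a few of the items listed.

```python
from dataclasses import dataclass, field


@dataclass
class VORegistration:
    """Hypothetical checklist of some of the information a new VO provides."""
    short_name: str            # e.g. "poohsticks"
    dns_name: str              # e.g. "vo.poohsticks.org"
    management_list: str       # mailing list for the VO managers
    security_list: str         # mailing list for the security contacts
    security_contacts: list = field(default_factory=list)
    expected_users: int = 0

    def problems(self):
        """Return a list of gaps; an empty list means nothing obvious is missing."""
        issues = []
        # The short name is usually (not necessarily) lower case.
        if not self.short_name or self.short_name != self.short_name.lower():
            issues.append("short name should be present and is usually lower case")
        if "." not in self.dns_name:
            issues.append("DNS style name should look like vo.example.org")
        for label, address in (("management", self.management_list),
                               ("security", self.security_list)):
            if "@" not in address:
                issues.append(f"{label} mailing list address is missing")
        if len(self.security_contacts) < 2:
            issues.append("ideally at least two security contacts")
        if self.expected_users <= 0:
            issues.append("expected number of users not given")
        return issues
```

For example, a record with short name "poohsticks", DNS style name "vo.poohsticks.org", both mailing lists and two security contacts reports no problems, while one with a single security contact is flagged.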

Security considerations

  • The VO managers are responsible for setting the aims of the VO: describe which work is in scope and which is not. Users must do only work which is in scope for the VO.
    • Each VO should define an acceptable use policy (AUP) for its members.
  • The VO security contacts (via the mailing list required for registering the VO with EGI) are expected to deal in a timely manner with breaches of policy. An incident may lead to the errant member of the VO being banned from the infrastructure, but if the GridPP admins/coordinators don't feel the incident is being dealt with satisfactorily by the VO, the whole VO may get banned (perhaps temporarily). During a ban, no jobs can be run and no data can be accessed at any site.
  • If the VO has data security requirements, these should be discussed with GridPP during setup of the VO.
    • A typical VO using GridPP will have every file readable by everyone in the VO, but define a role of users who are allowed to create, update, and delete files. However, more careful access control policies are possible.
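
The typical policy described above (everyone in the VO can read, one role can write) can be sketched as a small check on VOMS attributes. This is an illustration only, not how any storage system implements it: the VO name "poohsticks" and the role name "production" are assumptions, and real storage elements apply such rules through their own configuration.

```python
WRITE_ACTIONS = {"create", "update", "delete"}


def _parts(fqan):
    """Split a VOMS attribute such as '/poohsticks/Role=production' into components."""
    return fqan.strip("/").split("/")


def is_allowed(fqans, action, vo="poohsticks", writer_role="production"):
    """Decide whether a user holding the given VOMS attributes may perform an action.

    fqans       -- list of attribute strings from the user's proxy
    action      -- "read", "create", "update" or "delete"
    vo          -- VO name (hypothetical example value)
    writer_role -- role allowed to modify files (hypothetical example value)
    """
    vo_fqans = [_parts(f) for f in fqans if _parts(f)[0] == vo]
    if not vo_fqans:
        return False          # not a member of this VO
    if action == "read":
        return True           # every VO member may read every file
    if action in WRITE_ACTIONS:
        # Only holders of the writer role may create, update or delete.
        return any(f"Role={writer_role}" in parts for parts in vo_fqans)
    return False              # unknown actions are refused
```

With these assumptions, a plain member (attribute "/poohsticks") may read but not delete, while a user holding "/poohsticks/Role=production" may do both.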

VO software

Each VO will typically need some VO-specific services. Some of these are grid-based, such as file catalogues, resource brokers, compute and storage elements. There may be additional software required by the VO, and there will be means of making this available.

There are various models for dealing with the installation of VO-specific software. If only a few dedicated sites are involved the software can be directly installed by the administrators. If the software is relatively compact it can be shipped with the job in the sandbox, or downloaded from a Storage Element or a web site.

RAL Tier1 uses CVMFS to manage software for different VOs, particularly when they conflict (e.g. requiring different versions of python or libraries, etc.) Alternatively ask the admins on the sites that support your VO what they provide.

Support procedures

  • VOs should be prepared to support their users at least in the use of VO-specific software.
  • VO support liaison should sign up for a mailing list called GRIDPP-SUPPORT.
  • It is strongly recommended to make GridPP technical support contacts members of the VO, at least initially, so that they can help troubleshoot any problems that arise.

The standard support route for all Grid users is the GGUS portal, as described here. For regional (e.g. GridPP-specific) VOs the tickets will generally be directed back to the UK Grid helpdesk. There is also a GridPP Users mailing list for support/usage discussions.

The EGI VO Registration Form can be found here. It also provides a list of documents you should consult before creating a new VO.

The general procedure is sketched out in the section Instruction for VO administrators. The process is still under development, and anyone wishing to create a new VO should contact GridPP for further help and information.


VO Deployment

It is recommended that a VO initially keep all its data at a single site, even if jobs are running at several sites.

It is recommended to ask for, and use, a "space token" to read and write data (at all sites that host the VO's data).

Once a VO is set up, sites should be requested to support it - VOs typically do this via their designated GridPP support contact(s). At this point the VO should have:

  • Acceptable use policies defined
  • A VOMS server with VO information
  • Mailing lists
  • Resources allocated to it - CPU hours, disk storage, tape storage

Sites will then need to:

  • Add the VO to the list of supported VOs (e.g. gridmap files)
  • Add the VO to the UIs, if applicable (so users can do voms-proxy-init locally)
  • Allocate the resources to the VO, e.g. space tokens, disk pools
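
Once a UI has been configured for the VO, a user would normally obtain a VOMS proxy with voms-proxy-init. The sketch below only builds the command line in Python; the VO name "poohsticks" and the role "production" are hypothetical, and whether the short or the DNS style name is used depends on how the VO is registered on the VOMS server.

```python
def proxy_command(vo, role=None):
    """Build the argument list for requesting a VOMS proxy.

    vo   -- VO name as configured on the UI (hypothetical example: "poohsticks")
    role -- optional role to request (hypothetical example: "production")
    """
    target = vo if role is None else f"{vo}:/{vo}/Role={role}"
    return ["voms-proxy-init", "--voms", target]


# The list can be passed to subprocess.run() on a UI where the VO is enabled:
#   import subprocess
#   subprocess.run(proxy_command("poohsticks"), check=True)
```

Building the command as a list rather than a single string avoids shell quoting issues if it is later passed to subprocess.run.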

It is recommended that the VO attends the following meetings:

  • Tier 1 liaison meeting (if using Tier 1) - VOs should attend regularly

Once they are using GridPP resources, VOs are also welcome to attend:

  • GridPP collaboration meetings
  • Grid Storage meetings

This page is a Key Document, and is the responsibility of Jens Jensen. It was last reviewed on 2018-05-17 when it was considered to be 90% complete. It was last judged to be accurate on 2017-05-17.