Difference between revisions of "New VO deployment"

(moved policies section forward; added gdpr; added some AARC guidance; added links to GridPP user guide. Lowered "completeness" because there is more aarc guidance appearing)
Line 5: Line 5:
 
A ''Virtual Organisation'' or VO is a group or collaboration with a common purpose. It may be a particular project with funding, it may be a common purpose, or it may be a specific group of people collaborating on a particular task. Typically, a VO will be able to share data with every member in the VO, and they will have resources allocated to them as a VO.  VOs typically have a lifetime of years, not days or months.
 
A ''Virtual Organisation'' or VO is a group or collaboration with a common purpose. It may be a particular project with funding, it may be a common purpose, or it may be a specific group of people collaborating on a particular task. Typically, a VO will be able to share data with every member in the VO, and they will have resources allocated to them as a VO.  VOs typically have a lifetime of years, not days or months.
  
It is also possible to join an existing VO, of course, if there is one with similar goals: see the [http://operations-portal.egi.eu/vo/search EGI VO registration portal].
+
It is also possible to [https://www.gridpp.ac.uk/userguide/getting-on-the-grid/joining-a-vo.html join an existing VO], of course, if there is one with similar goals: see the [http://operations-portal.egi.eu/vo/search EGI VO registration portal].
  
 
VOs can be local (supported at a single site), national (supported by several sites but in the same country, e.g. on a national infrastructure), and international. What follows below is generally about international VOs; for regional VOs (local/national), some of the information requested below may not be needed.
 
VOs can be local (supported at a single site), national (supported by several sites but in the same country, e.g. on a national infrastructure), and international. What follows below is generally about international VOs; for regional VOs (local/national), some of the information requested below may not be needed.
 +
 +
=== VO Policies and Security ===
 +
* The VO managers are responsible for defining the policies of the VO:
 +
** The VO typically has a specific purpose, describing which work is in scope and which is not. Users must only do work which is in scope for the VO.
 +
** VOs should define an acceptable use policy (AUP) for its members. The AUP should be compatible with those of the infrastructures that the VO will be using (e.g. GridPP)
 +
** VOs generally process personal data of their members and should therefore comply with the GDPR; see [https://aarc-project.eu/wp-content/uploads/2018/05/AARC-G042-Data-Protection-Impact-Assessment-initial-guidance-for-communities.pdf AARC-G042] for guidance.
 +
* The VO security contacts (via the mailing list required for registering the VO with EGI) are expected to deal in a timely manner with breaches of policy. An incident may lead to the errant member of the VO being banned from the infrastructure, but if the GridPP admins/coordinators don't feel the incident is being dealt with satisfactorily by the VO, the whole VO may get banned (perhaps temporarily). During a ban, no jobs can be run and no data can be accessed at any site.
 +
* If the VO has data security requirements, these should be discussed with GridPP during setup of the VO.
 +
** A typical VO using GridPP will have every file readable by everyone in the VO, but define a ''role'' of users who are allowed to create, update, and delete files.  However, more careful access control policies are possible.
 +
  
 
===Information needed===
 
===Information needed===
Line 27: Line 37:
 
* Expected size of the VO (i.e how many users).
 
* Expected size of the VO (i.e how many users).
  
===Security considerations===
 
* The VO managers are responsible for setting the aims of the VO: describe which work is in scope and which is not. Users must do only work which is in scope for the VO.
 
** VOs should define an acceptable use policy (AUP) for its members.
 
* The VO security contacts (via the mailing list required for registering the VO with EGI) are expected to deal in a timely manner with breaches of policy. An incident may lead to the errant member of the VO being banned from the infrastructure, but if the GridPP admins/coordinators don't feel the incident is being dealt with satisfactorily by the VO, the whole VO may get banned (perhaps temporarily). During a ban, no jobs can be run and no data can be accessed at any site.
 
* If the VO has data security requirements, these should be discussed with GridPP during setup of the VO.
 
** A typical VO using GridPP will have every file readable by everyone in the VO, but define a ''role'' of users who are allowed to create, update, and delete files.  However, more careful access control policies are possible.
 
  
 
===VO software===
 
===VO software===
Line 79: Line 83:
 
[[Category:VOMS]]
 
[[Category:VOMS]]
  
{{KeyDocs|responsible=Jens Jensen|reviewdate=2018-05-17|accuratedate=2017-05-17|percentage=90}}
+
{{KeyDocs|responsible=Jens Jensen|reviewdate=2018-11-05|accuratedate=2018-11-05|percentage=80}}

Revision as of 16:23, 5 November 2018

This page covers both the creation and deployment of new VOs. Its target audience is the technical and support contacts of the new VO, and GridPP infrastructure sysadmins.

Creating a New VO

A Virtual Organisation or VO is a group or collaboration with a common purpose. It may be a particular project with funding, it may be a common purpose, or it may be a specific group of people collaborating on a particular task. Typically, a VO will be able to share data with every member in the VO, and they will have resources allocated to them as a VO. VOs typically have a lifetime of years, not days or months.

It is also possible to join an existing VO, of course, if there is one with similar goals: see the EGI VO registration portal.

VOs can be local (supported at a single site), national (supported by several sites but in the same country, e.g. on a national infrastructure), and international. What follows below is generally about international VOs; for regional VOs (local/national), some of the information requested below may not be needed.

VO Policies and Security

  • The VO managers are responsible for defining the policies of the VO:
    • The VO typically has a specific purpose, describing which work is in scope and which is not. Users must only do work which is in scope for the VO.
    • VOs should define an acceptable use policy (AUP) for their members. The AUP should be compatible with those of the infrastructures that the VO will be using (e.g. GridPP).
    • VOs generally process personal data of their members and should therefore comply with the GDPR; see AARC-G042 for guidance.
  • The VO security contacts (via the mailing list required for registering the VO with EGI) are expected to deal in a timely manner with breaches of policy. An incident may lead to the errant member of the VO being banned from the infrastructure, but if the GridPP admins/coordinators don't feel the incident is being dealt with satisfactorily by the VO, the whole VO may get banned (perhaps temporarily). During a ban, no jobs can be run and no data can be accessed at any site.
  • If the VO has data security requirements, these should be discussed with GridPP during setup of the VO.
    • A typical VO using GridPP will have every file readable by everyone in the VO, but define a role of users who are allowed to create, update, and delete files. However, more careful access control policies are possible.
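The typical access model described above (every member may read, only holders of a particular role may create, update, or delete) can be sketched as follows. This is an illustration only, not how VOMS or any storage element implements authorisation; the `Member` class, the `is_allowed` function, and the role name "production" are all hypothetical.

```python
from dataclasses import dataclass

# Actions that the typical policy restricts to holders of a writer role.
WRITE_ACTIONS = {"create", "update", "delete"}


@dataclass(frozen=True)
class Member:
    """A VO member with the roles assigned to them (hypothetical model)."""
    name: str
    vo: str
    roles: frozenset = frozenset()


def is_allowed(member, action, file_vo, writer_role="production"):
    """Return True if the member may perform the action on a file owned by file_vo.

    The role name "production" is an assumption made for this example.
    """
    if member.vo != file_vo:
        return False  # not a member of the VO that owns the file
    if action == "read":
        return True  # every file is readable by everyone in the VO
    if action in WRITE_ACTIONS:
        return writer_role in member.roles
    return False  # unknown actions are refused


alice = Member("alice", "poohsticks", frozenset({"production"}))
bob = Member("bob", "poohsticks")
print(is_allowed(bob, "read", "poohsticks"))      # True
print(is_allowed(bob, "delete", "poohsticks"))    # False
print(is_allowed(alice, "delete", "poohsticks"))  # True
```

A more careful policy could, for example, restrict reading to a subgroup; the point of the sketch is only that the decision depends on VO membership plus role.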


Information needed

The VO will need to provide some information, partly for practical and security reasons and partly to help system administrators estimate what resources the VO will be likely to need.

  • Name of the VO. This should be reasonably short, distinctive, and must not clash with any existing VO. A VO will typically have two names, a short name (usually lower case), say "poohsticks" (an experiment running poohsticks simulations), and a DNS style name, such as vo.poohsticks.org (assuming they own the DNS domain poohsticks.org.)
  • VO management: a VO manager (who can decide membership of the VO, roles and responsibilities), plus usually at least one deputy or co-manager. You will need an email address for the management contact(s), and it is recommended to use a mailing list so people can be added and messages are archived (e.g. poohsticks-management@example.com).
  • VO support contacts (see below) - you can choose to register support people with the helpdesks; this is recommended for larger VOs or for VOs that have VO-specific software on the grid.
  • Security contacts - ideally at least two people who can respond quickly in the event of a security incident relating to a member of the VO, or to the VO as a whole. This also needs to have a mailing list.
  • A VOMS server. The VOMS server manages the VO's memberships and the members' roles and subgroups. If the VO is not already hosted on one of the existing VOMS servers, it is recommended to request use of GridPP's - for this, you will typically need authorisation from the principal investigator (PI).
    • VO members including the managers will usually need personal X.509 certificates from an IGTF CA.
      • One of these CAs is RCauth, which can generate (lower assurance) certificates based on federated identities.
      • Increasingly many research infrastructures provide (web) authentication through portals with federated identity management, meaning users can use federated login to authenticate to the infrastructure. However, in VOMS, roles etc. are usually assigned to a distinguished name, meaning privileged users without certificates may need credential conversion (i.e. an online CA) to create certificates for them.
    • Optionally, roles and/or subgroups of the VO can be defined - see VO Policies and Security above. These are set up by the VO manager once they are authorised on the VOMS server.
  • Hardware requirements - memory size, disk space etc, types of storage required (working repositories, archives, databases)
  • Software requirements - any software beyond the basic Linux tools/libraries, including things which are part of standard distributions as they may not be installed by default.
  • Typical usage pattern - expected job frequency and variation over time, job length, data read and written per job etc.
  • General procedures - for example if the site has to request the installation of VO-specific software.
  • Expected size of the VO (i.e. how many users).
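The naming guidance in the list above (a short, usually lower-case name and a DNS-style name, neither clashing with an existing VO) can be checked mechanically before registration. The sketch below is an assumption-laden illustration: the function `check_vo_names`, the regular expressions, and the length limit of 20 characters are not EGI or GridPP rules, and the list of existing VOs would in practice come from the EGI VO registration portal.

```python
import re

# Hypothetical conventions for this example only.
SHORT_NAME = re.compile(r"[a-z][a-z0-9._-]*")
DNS_NAME = re.compile(r"(?:[a-z0-9](?:[a-z0-9-]{0,61}[a-z0-9])?\.)+[a-z]{2,}")
MAX_SHORT_LENGTH = 20  # arbitrary reading of "reasonably short"


def check_vo_names(short_name, dns_name, existing_names):
    """Return a list of problems found with the proposed names (empty if none)."""
    problems = []
    if not SHORT_NAME.fullmatch(short_name):
        problems.append(f"short name {short_name!r} is not a lower-case identifier")
    if len(short_name) > MAX_SHORT_LENGTH:
        problems.append(f"short name {short_name!r} is longer than {MAX_SHORT_LENGTH} characters")
    if not DNS_NAME.fullmatch(dns_name):
        problems.append(f"{dns_name!r} does not look like a DNS-style name")
    taken = {name.lower() for name in existing_names}
    for name in (short_name, dns_name):
        if name.lower() in taken:
            problems.append(f"{name!r} clashes with an existing VO")
    return problems


print(check_vo_names("poohsticks", "vo.poohsticks.org", ["atlas", "cms"]))  # []
```

A check like this does not establish that the VO owns the DNS domain; that still has to be confirmed by the VO management.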


VO software

Each VO will typically need some VO-specific services. Some of these are grid-based, such as file catalogues, resource brokers, compute and storage elements. There may be additional software required by the VO, and there will be means of making this available.

There are various models for dealing with the installation of VO-specific software. If only a few dedicated sites are involved the software can be directly installed by the administrators. If the software is relatively compact it can be shipped with the job in the sandbox, or downloaded from a Storage Element or a web site.

RAL Tier1 uses CVMFS to manage software for different VOs, particularly when their requirements conflict (e.g. different versions of Python or of libraries). Alternatively, ask the admins at the sites that support your VO what they provide.
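Where software is distributed through CVMFS, a job can check that the VO's repository is available before it tries to use it. The following is a minimal sketch, assuming repositories appear under the conventional /cvmfs mount point; the function `find_vo_software` and the repository name are hypothetical, and the actual repository name should be obtained from the site admins.

```python
from pathlib import Path


def find_vo_software(repository, cvmfs_root="/cvmfs"):
    """Return the path of the repository if it is available, otherwise None.

    The path is accessed directly rather than by listing cvmfs_root, because
    repositories are commonly mounted on demand and may not appear in a listing.
    """
    repo_path = Path(cvmfs_root) / repository
    return repo_path if repo_path.is_dir() else None


# "poohsticks.example.org" is a made-up repository name.
location = find_vo_software("poohsticks.example.org")
if location is None:
    print("VO software repository not available on this node")
else:
    print(f"VO software found under {location}")
```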

Support procedures

  • VOs should be prepared to support their users at least in the use of VO-specific software.
  • VO support liaison should sign up for a mailing list called GRIDPP-SUPPORT.
  • It is strongly recommended to make GridPP technical support contacts members of the VO, at least initially, so that they can help troubleshoot any problems that arise.

The standard support route for all Grid users is the GGUS portal, as described here. For regional (e.g. GridPP-specific) VOs the tickets will generally be directed back to the UK Grid helpdesk. There is also a GridPP Users mailing list for support/usage discussions.

The EGI VO Registration Form can be found here. It also provides a list of documents you should consult before creating a new VO.

The general procedure is sketched out in the section Instruction for VO administrators. The process is still under development, and anyone wishing to create a new VO should contact GridPP for further help and information.


VO Deployment

It is recommended that a VO initially keep all its data at a single site, even if jobs are running at several sites.

It is recommended to ask for, and use, a "space token" to read and write data (at all sites that host the VO's data).

Once a VO is set up, sites should be requested to support it - VOs typically do this via their designated GridPP support contact(s). At this point the VO should have:

  • Acceptable use policies defined
  • A VOMS server with VO information
  • Mailing lists
  • Resources allocated to it - CPU hours, disk storage, tape storage
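A first estimate of the CPU hours and disk storage to request can be derived from the usage pattern the VO supplies (job frequency, job length, data written per job). The sketch below is a back-of-the-envelope calculation only: the function name and the example figures are invented, and it assumes single-core jobs, a constant job rate, and that all output is kept.

```python
def estimate_yearly_resources(jobs_per_day, hours_per_job, gb_written_per_job,
                              days_per_year=365):
    """Rough yearly totals, assuming single-core jobs and no deletion of output."""
    jobs = jobs_per_day * days_per_year
    return {
        "jobs": jobs,
        "cpu_hours": jobs * hours_per_job,
        "storage_tb": jobs * gb_written_per_job / 1000,  # 1 TB taken as 1000 GB
    }


# Invented example: 100 jobs a day, 2 hours each, 0.5 GB written per job.
print(estimate_yearly_resources(100, 2, 0.5))
# {'jobs': 36500, 'cpu_hours': 73000, 'storage_tb': 18.25}
```

Real usage is rarely constant, so such numbers are best treated as a starting point for discussion with the sites.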

Sites will then need to:

  • Add the VO to the list of supported VOs (e.g. gridmap files)
  • Add the VO to the UIs, if applicable (so users can do voms-proxy-init locally)
  • Allocate the resources to the VO, e.g. space tokens, disk pools
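Once a UI has been configured for the VO, a member obtains a proxy with voms-proxy-init. The helper below only assembles the command line, so it can be inspected without a grid environment; the function `voms_proxy_command` is hypothetical, "poohsticks" is the example VO from this page, and the role "production" is an assumption. Whether the short or the DNS-style name is used depends on how the VO is registered on the VOMS server.

```python
import shlex


def voms_proxy_command(vo, group=None, role=None):
    """Build the argument list for voms-proxy-init for the given VO.

    With no group or role, a plain VO proxy is requested. Otherwise the
    group (defaulting to the VO's root group) and role are appended to the
    VO name after a colon.
    """
    if group is None and role is None:
        return ["voms-proxy-init", "--voms", vo]
    fqan = group if group is not None else f"/{vo}"
    if role is not None:
        fqan += f"/Role={role}"
    return ["voms-proxy-init", "--voms", f"{vo}:{fqan}"]


print(shlex.join(voms_proxy_command("poohsticks")))
# voms-proxy-init --voms poohsticks
print(shlex.join(voms_proxy_command("poohsticks", role="production")))
# voms-proxy-init --voms poohsticks:/poohsticks/Role=production
```

The resulting list can be passed to subprocess.run on a machine where the VOMS clients are installed and the user has a valid certificate.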

It is recommended that the VO attends the following meetings:

  • Tier 1 liaison meeting (if using Tier 1) - VOs should attend regularly

Once they are using GridPP resources, VOs are also welcome to attend

This page is a Key Document, and is the responsibility of Jens Jensen. It was last reviewed on 2018-11-05 when it was considered to be 80% complete. It was last judged to be accurate on 2018-11-05.