Setting up a new Virtual Organisation (VO)

This page is intended to become an example of how a model VO gets up and running on the grid.

You are strongly advised to discuss this with your friendly local sysadmin.


VO Setup

  • Choose a name and request that the VO be created
    • The name must have the format of a DNS name, e.g. t2k.org (see the sketch after this list)
    • The VO should have control of the domain name referenced in the VO name
    • It should also reflect the scope/ownership of the VO, e.g. a .ac.uk domain would not be suitable for an international experiment
  • Request that the VO is enabled at sites.
    • We recommend starting with a small number of sites, getting things working, then expanding.
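
As an illustration of the DNS-name requirement, here is a minimal Python sketch that checks whether a proposed VO name looks like a domain name. The helper name and the exact rules are ours, not part of the registration process, which may apply different checks.

import re

# Rough check that a proposed VO name has the shape of a DNS name (e.g. "t2k.org").
# Illustration only: the actual VO registration process may apply stricter rules.
_LABEL = r"[a-z0-9]([a-z0-9-]{0,61}[a-z0-9])?"
_DNS_NAME = re.compile(r"^(%s\.)+%s$" % (_LABEL, _LABEL))

def looks_like_dns_name(name):
    """Return True if 'name' has at least two dot-separated DNS labels."""
    return bool(_DNS_NAME.match(name.lower()))

if __name__ == "__main__":
    for candidate in ("t2k.org", "myexperiment.example.ac.uk", "not a dns name"):
        print("%-30s %s" % (candidate, looks_like_dns_name(candidate)))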

Job and data management

Bookkeeping and managing failures are what seem to cause VOs the most difficulty.


Job Management

There are several software packages that may help manage sets of jobs:

  • Ganga
  • DIRAC (a minimal submission sketch follows this list)
  • PanDA (currently ATLAS only)
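
For example, a minimal DIRAC submission might look like the sketch below. This is only a sketch: it assumes the DIRAC client is installed and a valid VO proxy exists, and the exact API calls (e.g. submit vs. submitJob) vary between DIRAC versions, so check the documentation for your installation.

# Minimal DIRAC job submission sketch (assumes a sourced DIRAC client
# environment and a valid proxy; API details vary between DIRAC versions).
from DIRAC.Core.Base import Script
Script.parseCommandLine(ignoreErrors=True)

from DIRAC.Interfaces.API.Dirac import Dirac
from DIRAC.Interfaces.API.Job import Job

job = Job()
job.setName("myvo-hello-world")
job.setExecutable("/bin/echo", arguments="Hello from my VO")

dirac = Dirac()
result = dirac.submit(job)  # some DIRAC versions call this submitJob()
print(result)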

Use MyProxy to enable jobs to renew their proxy certificates and run for longer than 24 hours.

Consider defining privileged users for priority tasks.

Software deployment

  • CVMFS (see the sketch after this list)
  • Classic sgm jobs
    • Via GitHub (see the SNO+ example)
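
As a small illustration of the CVMFS route, a job can simply test that its software area is visible on the worker node before running anything. The repository path below is a made-up placeholder for whatever repository your VO is allocated.

import os
import sys

# Placeholder path: substitute the CVMFS repository your VO has been allocated.
CVMFS_AREA = "/cvmfs/myvo.example.org"

def software_available(area=CVMFS_AREA):
    """Return True if the VO software area is mounted and visible on this node."""
    return os.path.isdir(area)

if __name__ == "__main__":
    if not software_available():
        sys.stderr.write("CVMFS area %s is not visible on this node\n" % CVMFS_AREA)
        sys.exit(1)
    print("Using software from %s" % CVMFS_AREA)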

Data management

Bookkeeping is a major headache.

  • ATLAS's Rucio may help, but it is not yet available.
    • DIRAC may help here too
  • Keep multiple copies of data you care about.
    • Catastrophic failure of a site (e.g. fire)
    • Failure of a tape
    • Accidental deletion (by you or the site)

  • Checksum data (a checksumming sketch follows this list)
    • On transfer
    • Ask supporting sites to checksum files to catch silent data corruption (see CERN paper)

  • Privileged users (so a normal user can't delete vital data)
  • Use FTS to transfer large sets of data (but catch failures)
  • Register files in the LFC?
    • DIRAC has the DIRAC File Catalogue as well
  • Use a consistent mapping from LFC name to SURL (cf. the CMS trivial file catalogue; see the mapping sketch below)
    • This is enforced by the DIRAC File Catalogue
  • Check consistency between the LFC and the data at sites (also illustrated in the sketch below)
  • Federated access to storage via WebDAV may be possible in the future.
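
To illustrate the checksumming points above, here is a minimal Python sketch that computes an Adler-32 checksum of a file in chunks. Adler-32 is commonly used by grid storage elements, but check which algorithm (Adler-32 or MD5) your supporting sites actually report.

import sys
import zlib

def adler32_of_file(path, chunk_size=4 * 1024 * 1024):
    """Compute the Adler-32 checksum of a file, reading it in chunks.

    Returned as an 8-character lower-case hex string, the form in which
    storage elements usually report checksums.
    """
    checksum = 1  # Adler-32 initial value
    with open(path, "rb") as handle:
        while True:
            chunk = handle.read(chunk_size)
            if not chunk:
                break
            checksum = zlib.adler32(chunk, checksum)
    return "%08x" % (checksum & 0xFFFFFFFF)

if __name__ == "__main__":
    for filename in sys.argv[1:]:
        print("%s  %s" % (adler32_of_file(filename), filename))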
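
The sketch below shows what a deterministic, trivial-file-catalogue style mapping from a logical file name to a SURL could look like, and how such a mapping reduces a catalogue-versus-site consistency check to a set comparison. The storage endpoint and paths are made-up placeholders, not a real site's configuration.

# Deterministic LFN -> SURL mapping in the spirit of the CMS trivial file
# catalogue. The endpoint and path prefix are made-up placeholders; each
# supporting site would publish its own.
SE_PREFIX = "srm://se.example.ac.uk:8446/srm/managerv2?SFN=/dpm/example.ac.uk/home/myvo"

def lfn_to_surl(lfn, prefix=SE_PREFIX):
    """Map a logical file name such as /myvo/raw/run001/file1.dat to a SURL."""
    if not lfn.startswith("/"):
        raise ValueError("LFN must be an absolute path: %r" % lfn)
    return prefix + lfn

def compare_catalogue_with_site(catalogue_lfns, site_surls, prefix=SE_PREFIX):
    """Return (missing_at_site, orphaned_at_site) as sorted lists of SURLs."""
    expected = set(lfn_to_surl(lfn, prefix) for lfn in catalogue_lfns)
    found = set(site_surls)
    return sorted(expected - found), sorted(found - expected)

if __name__ == "__main__":
    catalogue = ["/myvo/raw/run001/file1.dat", "/myvo/raw/run001/file2.dat"]
    on_site = [lfn_to_surl("/myvo/raw/run001/file1.dat"),
               SE_PREFIX + "/myvo/user/old/leftover.dat"]
    missing, orphaned = compare_catalogue_with_site(catalogue, on_site)
    print("Missing at site:  %s" % missing)
    print("Orphaned at site: %s" % orphaned)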

LHC-like data model

RAW data: one copy at CERN (Tier-0) and two copies at Tier-1 sites. Custodial storage is on tape (but there may be a disk copy too).

Processed data: reprocessed every few months at the Tier-1s (with newer versions of the software), with copies kept at Tier-1. Distributed to Tier-2 sites for user analysis.

Simulation: Monte Carlo simulations are carried out at Tier-2 sites. Results are packaged up and archived at a Tier-1, with copies shipped out to Tier-2 sites.