Using Castor At RAL

Introduction

This page describes how to use the General Purpose (GEN) Castor instance at RAL. This is a shared storage element for all the non-LHC VOs supported by the GridPP collaboration. This page was written in January 2013 and the information should be accurate until the end of 2014. If you have problems with any of the instructions here, please contact support@gridpp.rl.ac.uk.

When setting up the General Purpose instance at RAL we assumed that non-LHC VOs would not have a great deal of experience with Grid computing, so the service is designed to do as much for you as possible. We assumed the following use cases:

  • The VO may not know how much space they need.
  • The VO may not need to use their data for long periods.
  • The VO is unlikely to have the manpower to deal effectively with data cleanup and problem resolution.


Site Setup

The GEN instance has two separate service classes called GenTape and GenScratch.

  • GenTape is a D0T1 service class. It consists of a 300TB disk buffer in front of the tape service. When a file is written to GenTape it is placed on the disk buffer, and within an hour an additional copy of the file is written to tape. As more files are written, the disk buffer will slowly fill up. Periodically Castor will automatically remove the least recently used (LRU) files from the disk buffer; it currently takes around 6 months of inactivity for a file to be removed. If you need a file that has been removed from the disk buffer it will have to be recalled from tape, which can take a few minutes. However, given that the file will have been unused for several months, it should not matter if a job takes a few more minutes to run while it waits for the file to be retrieved from tape.


  • GenScratch is a D0T0 service class. It consists of a 200TB disk pool which automatically cleans itself up when it gets full; you can think of it as a very large grid-accessible /tmp space. The cleanup also works on an LRU basis. No guarantees are made about how long files will remain on this disk, although it currently takes several months before files are deleted. Note that if you register your files in a file catalogue such as the LFC, they will not be automatically removed from there, which can lead to consistency problems if a VO is not careful. All (non-LHC) VOs that can run jobs at RAL can use this space.


Using Castor

The locations of all files stored in Castor are recorded in the NameServer. Files that are to be put in GenTape should be written to:

/castor/ads.rl.ac.uk/prod/<VO name>/tape

while those in GenScratch go to:

/castor/ads.rl.ac.uk/prod/<VO name>/disk

You can create any higher level directory structure you like to organize your files.
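
These paths combine with the SRM endpoint to form the full URLs used by the transfer commands below. A minimal sketch of how such a URL is assembled (the VO name "myvo" and the trailing path are hypothetical examples; substitute your own):

```shell
# Build the SRM URL for a file in Castor at RAL.
VO="myvo"                # hypothetical VO name
SERVICE_CLASS="disk"     # "disk" for GenScratch, "tape" for GenTape
SURL="srm://srm-gen.gridpp.rl.ac.uk/castor/ads.rl.ac.uk/prod/${VO}/${SERVICE_CLASS}/path/to/testfile"
echo "$SURL"
```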

Reading/Writing files

The most straightforward way to read and write files is with the lcg-cp command. You will need to use a machine with the lcg commands installed, normally referred to as a UI (speak to your site's sysadmin if you need this set up). You also need a valid certificate and to have set up a proxy:

$ voms-proxy-init -voms <VO name>
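
Proxies expire after a few hours, so it is worth checking yours is still valid before starting transfers. A hedged sketch (voms-proxy-info returns non-zero when no valid proxy exists; the status messages are our own):

```shell
# Check for a proxy valid for at least 10 more minutes.
# Falls through to "missing" when the voms tools are absent or no proxy exists.
if command -v voms-proxy-info >/dev/null 2>&1 \
   && voms-proxy-info --exists --valid 0:10 >/dev/null 2>&1; then
    PROXY_STATUS="ok"
else
    PROXY_STATUS="missing"
fi
echo "proxy status: $PROXY_STATUS"
```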

Then to copy a file from Castor to your local machine:

$ lcg-cp --vo <VO name> srm://srm-gen.gridpp.rl.ac.uk/castor/ads.rl.ac.uk/prod/<VO name>/disk/my/directory/structure/testfile /tmp/testfile

To copy a file from your local machine to Castor:

$ lcg-cp --vo <VO name> /tmp/testfile srm://srm-gen.gridpp.rl.ac.uk/castor/ads.rl.ac.uk/prod/<VO name>/disk/my/directory/structure/testfile

The lcg-cp command will automatically create whatever directory structure you want, so there is no need to make directories first. Full documentation for the lcg- commands can be found here. Other useful commands include:

  • lcg-del for removing files. You don't need to do this for GenScratch.
  • lcg-cr for registering files to the LFC if you are using one.
  • lcg-bringonline for accessing a tape-backed file that hasn't been used in a while; submitting this command gets the file copied back to disk. Useful if you are likely to analyse some older data in the near future.
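
As a sketch of how lcg-bringonline fits into a workflow, the following previews a recall-then-copy sequence. DRYRUN=echo just prints the commands so the sketch runs anywhere; clear it on a real UI to execute them. The VO name and file path are hypothetical:

```shell
# Recall a tape-backed file to the GenTape disk buffer, then copy it locally.
DRYRUN=echo   # set to empty on a real UI to actually run the commands
VO="myvo"     # hypothetical VO name
SURL="srm://srm-gen.gridpp.rl.ac.uk/castor/ads.rl.ac.uk/prod/${VO}/tape/data/run1.dat"

$DRYRUN lcg-bringonline "$SURL"                    # ask Castor to stage the file back to disk
$DRYRUN lcg-cp --vo "$VO" "$SURL" /tmp/run1.dat    # then copy it to the local machine
```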

Permissions

Castor has a file that maps grid proxies to Castor users. The most basic/default setup is to map all users from a certain VO to the same castor user. This will allow anybody within that VO to read/write data at RAL. You can of course request that certain directories have restricted permissions for certain users/groups. We can customize this to your requirements.

Checking on Usage

The only space that VOs need to worry about is the amount that they are using on tape. The disk buffers automatically look after themselves and any data stored on them is not permanent, so disk usage is not counted against VO usage.

To check the amount of space your VO has used you can use the following command (for example for T2K):

$ lcg-infosites -f RAL-LCG2 --vo t2k.org space
    Free     Used Reserved     Free     Used Reserved Tag                    SE
  Online   Online   Online Nearline Nearline Nearline                        
-------------------------------------------------------------------------------
    5184    12096    17280    85991   335687   507450 -                      srm-t2k.gridpp.rl.ac.uk
    5184    12096    17280    85991   335687   507450 T2KORGDISK             srm-t2k.gridpp.rl.ac.uk

The important value is Used Nearline, which is in GB. This number is also shown on the RAL status page under Storage Used (GB). The other numbers can be misleading, as space is allocated as required and does not count against your usage.

Transfers between sites (FTS)

If you wish to perform regular or bulk transfers between grid sites then you should use the File Transfer Service (FTS). A good example of this is exporting your VO's data from the site where it is created to sites around the world for backup and analysis. There are several FTS services in production around the world. RAL runs an FTS service and this controls all transfers to the UK from abroad (with the exception of CERN) and all transfers within the UK. More information about the FTS can be found here. If you wish to use the FTS you will need to contact support@gridpp.rl.ac.uk and provide:

  • The name of your VO
  • The names of the grid sites you want to transfer between.
  • An estimate of the amount of data you expect to transfer.

RAL can then configure the FTS to manage transfers between these sites. You can find instructions for submitting transfers to the FTS here.
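
Once the channels are configured, a submission looks roughly like the sketch below, assuming the FTS2 glite-transfer-submit client that was current when this page was written (FTS3 replaces it with fts-transfer-submit). The endpoint URL and both SRM URLs are hypothetical placeholders, and DRYRUN=echo only previews the command:

```shell
# Preview an FTS job submission between two sites.
DRYRUN=echo
FTS_ENDPOINT="https://fts.example.ac.uk:8443/glite-data-transfer-fts/services/FileTransfer"  # hypothetical
SRC="srm://source-se.example.ac.uk/data/myvo/run1.dat"                                       # hypothetical source
DST="srm://srm-gen.gridpp.rl.ac.uk/castor/ads.rl.ac.uk/prod/myvo/tape/run1.dat"

# glite-transfer-submit prints a job ID, which can then be polled with glite-transfer-status.
$DRYRUN glite-transfer-submit -s "$FTS_ENDPOINT" "$SRC" "$DST"
```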

Glossary

  • Service Class This is the name Castor gives to a block of storage. It is very similar to Space Tokens that are used by other types of storage elements. Different Service Classes are able to store data in different ways.
  • DxTy This describes the number of copies of data that will be kept permanently in a service class. For example D1T0 means 1 copy of the data will be kept on disk and no copies will be kept on tape. At times there might be 2 copies of the data on disk (for example if we were moving the data to newer hardware) however only 1 copy of the data is guaranteed.
  • kB 1000 bytes. (Units here are powers of 10^3 rather than 2^10.)
  • NameServer This is a central directory structure that keeps a record of all the files that are stored in Castor.