VO specific software on the Grid

From GridPP Wiki

Revision as of 08:43, 27 September 2017

DRAFT

Accessing software distributed via CVMFS

For most VOs, software is now distributed via CVMFS. The only detail a user (client) needs to know is how the repositories are mapped on the Worker Nodes. In this article we use the gridpp VO repository, which is mapped to /cvmfs/gridpp.egi.eu/. A VO software administrator has uploaded the following example Python script and saved it as testing/HelloWorld/hello.py:


#!/usr/bin/env python
import sys

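# print() with a single string argument works under both Python 2 and Python 3.
print("----------------------")
# Backslashes are doubled so the string contains no invalid escape sequences.
print("Hello, I'm a snake !  /\\/\\/o")
print("----------------------")

print(" More info:\n")

print(sys.version)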

#
 

It normally takes a few hours before uploaded software becomes available to clients. Now we need to create a job wrapper (run_hello_cvmfs.sh) which will be submitted as a Dirac executable:


#!/bin/bash
#
# Run the Python script.
export GRIDPP_VO_CVMFS_ROOT=/cvmfs/gridpp.egi.eu/testing/HelloWorld
if [ -d "$GRIDPP_VO_CVMFS_ROOT" ]; then
   "$GRIDPP_VO_CVMFS_ROOT/hello.py"
else
   echo "Requested CVMFS directory does not exist: $GRIDPP_VO_CVMFS_ROOT" >&2
   exit 1
fi
#
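The same directory check can also be written in Python, which may be convenient when the job payload is itself a Python script. The following is only a minimal sketch, assuming Python 3 on the worker node; the helper names cvmfs_path and require_dir are hypothetical and are not part of any CVMFS or DIRAC API.

```python
import os

# Conventional CVMFS mount point, as used in the paths in this article.
CVMFS_MOUNT = "/cvmfs"


def cvmfs_path(repository, *parts):
    """Build the path of a file or directory inside a CVMFS repository."""
    return os.path.join(CVMFS_MOUNT, repository, *parts)


def require_dir(path):
    """Return path if it is an existing directory, otherwise raise an error."""
    if not os.path.isdir(path):
        raise FileNotFoundError(
            "Requested CVMFS directory does not exist: " + path
        )
    return path
```

A job script could then call require_dir(cvmfs_path("gridpp.egi.eu", "testing", "HelloWorld")) before trying to run anything from that directory.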


The last step is to create a Dirac jdl file (hello_cvmfs.jdl):

[
JobName = "Snake_Job_CVMFS";
Executable = "run_hello_cvmfs.sh";
Arguments = "";
StdOutput = "StdOut";
StdError = "StdErr";
InputSandbox = {"run_hello_cvmfs.sh"};
OutputSandbox = {"StdOut","StdErr"};
]
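When several similar jobs are needed, a JDL file of this shape can be generated rather than written by hand. The sketch below only reproduces the fields used in this article; the function name make_jdl is hypothetical and it does not use any DIRAC library.

```python
def make_jdl(job_name, executable, arguments=""):
    """Build a JDL description that ships the executable in the input sandbox."""
    lines = [
        "[",
        'JobName = "{0}";'.format(job_name),
        'Executable = "{0}";'.format(executable),
        'Arguments = "{0}";'.format(arguments),
        'StdOutput = "StdOut";',
        'StdError = "StdErr";',
        # Doubled braces produce literal { } around the file name.
        'InputSandbox = {{"{0}"}};'.format(executable),
        'OutputSandbox = {"StdOut","StdErr"};',
        "]",
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Print the JDL for the example job used in this article.
    print(make_jdl("Snake_Job_CVMFS", "run_hello_cvmfs.sh"), end="")
```

The printed text can be redirected to hello_cvmfs.jdl and submitted as usual.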

In the jdl we define the executable (run_hello_cvmfs.sh) which is shipped with the job in the input sandbox. Now we can submit our first CVMFS job:

dirac-wms-job-submit -f logfile hello_cvmfs.jdl

Check its status, which in our case returned:

dirac-wms-job-status -f logfile
JobID=5213546 Status=Running; MinorStatus=Job Initialization; Site=VAC.UKI-LT2-RHUL.uk;
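If the status needs to be inspected from a script, a line like the one above can be split into its fields. This is a sketch that assumes the output always has the layout of the single sample line shown here (job ID first, then semicolon-separated key=value pairs); other DIRAC versions may format it differently, and parse_job_status is a hypothetical helper name.

```python
def parse_job_status(line):
    """Split 'JobID=... Status=...; MinorStatus=...; Site=...;' into a dict."""
    fields = {}
    # The job ID is separated from the rest by the first space;
    # the remaining fields are separated by semicolons.
    job_part, _, rest = line.strip().partition(" ")
    for chunk in [job_part] + rest.split(";"):
        key, sep, value = chunk.strip().partition("=")
        if sep:  # skip empty chunks, e.g. after the trailing semicolon
            fields[key] = value
    return fields


if __name__ == "__main__":
    sample = ("JobID=5213546 Status=Running; "
              "MinorStatus=Job Initialization; Site=VAC.UKI-LT2-RHUL.uk;")
    print(parse_job_status(sample))
```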

When the job finishes, we can retrieve the output (dirac-wms-job-get-output -f logfile), which reads:

----------------------
Hello, I'm a snake !  /\/\/o
----------------------
 More info:

2.7.12 (default, Dec 17 2016, 21:07:48) 
[GCC 4.4.7 20120313 (Red Hat 4.4.7-17)]

As stated above, although CVMFS provides easy access to an experiment's software, it is not well suited to rapid software changes: updates typically become visible a few hours after uploading. It is best suited to distributing well-tested software. In some cases, however, it may be necessary to apply quick patches, test different external libraries, and so on. One way of achieving this is to use a code versioning system, e.g. git. We'll cover this topic in the next section.

Using Code Versioning Systems (example: git)

We'll try to use git to access our software. This will still be the same trivial Python script as used above. Clearly, pulling in a few GB of code, building it on every WN and submitting 1000 jobs as a "test" is not the use case described here.

Our job wrapper will look like this (run_hello.sh):

#!/bin/bash
#
# Get the Python script from Github:
wget https://github.com/martynia/HelloWorld/archive/master.zip
unzip master.zip
cp HelloWorld-master/hello.py .
./hello.py
#

And the jdl:

[
JobName = "Snake_Job";
Executable = "run_hello.sh";
Arguments = "";
StdOutput = "StdOut";
StdError = "StdErr";
InputSandbox = {"run_hello.sh"};
OutputSandbox = {"StdOut","StdErr"};
]

This method does not require the VO to host its own git installation; we simply download the zipped software bundle.
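The download and unpack steps of the wrapper could also be done from Python instead of wget and unzip. The sketch below assumes Python 3 is available on the worker node, which this article does not guarantee, and assumes that the archive URL and the top-level directory name follow the pattern seen in the wrapper above (HelloWorld-master for the master branch). The function names are hypothetical.

```python
import urllib.request
import zipfile


def github_archive(owner, repo, branch="master"):
    """Return (download URL, top-level directory name) for a branch archive."""
    url = "https://github.com/{0}/{1}/archive/{2}.zip".format(owner, repo, branch)
    top_dir = "{0}-{1}".format(repo, branch)
    return url, top_dir


def fetch_and_unpack(url, archive="master.zip", dest="."):
    """Download the zip archive and unpack it into dest (needs network access)."""
    urllib.request.urlretrieve(url, archive)
    with zipfile.ZipFile(archive) as bundle:
        bundle.extractall(dest)
```

Note that zipfile's extractall does not restore the executable bit, so a script unpacked this way should be started through the interpreter (python hello.py) rather than as ./hello.py.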

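Alternatively, we could use the git installation provided in CERN's CVMFS repository, which is located (at the time of writing) at /cvmfs/sft.cern.ch/lcg/git-2.9.3/. We would need to replace the wget and unzip lines in the job wrapper above with a git invocation, and copy hello.py from the resulting HelloWorld directory instead of HelloWorld-master: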

/cvmfs/sft.cern.ch/lcg/git-2.9.3/git clone https://github.com/martynia/HelloWorld.git

Then submit the job in the usual way.