GPU Support

  • We have recently added some grid nodes with Nvidia GA100 GPUs at UKI-LT2-IC-HEP.
  • Apart from QMUL, we do not know whether GPUs are available at other grid sites, or how much they have been tested or used there.
  • These are very much "experimental" at the moment and their use has not been well tested. You should have a very good understanding of how your GPU code works before trying to run it on the grid.
  • The worker nodes only have a minimal software stack installed. Your job environment will need to provide CUDA support using something like Anaconda, or perhaps by means of a container image (currently untested).
  • If you require support please email lcg-site-admin at imperial.ac.uk.

Anaconda Example Using DIRAC

The following example is based on the Anaconda python distribution, so some familiarity with it is desirable. Through Anaconda we can obtain "cudatoolkit", which provides CUDA support for the GPU, and "numba", a python library that lets python code run on the GPU.

[
JobName = "gpu_test";
Executable = "gpu_test.sh";
Arguments = "";
StdOutput = "StdOut";
StdError = "StdErr";
InputSandbox = {"gpu_test.sh","gpu_test.py","LFN:/gridpp/user/d/dan.whitehouse/Anaconda3-2022.05-Linux-x86_64.sh"};
OutputSandbox = {"StdOut","StdErr"};
Site = "LCG.UKI-LT2-IC-HEP.uk";
Tags = {"GPU"}
]

In our InputSandbox we have three files:

  • A bash wrapper
  • Our python script which represents the GPU job
  • The x86_64 installer freshly downloaded from the Anaconda website - we upload this to a Storage Element (SE) and reference it by its LFN, as it contains binary data and is rather large

The bash script installs Anaconda into our job's scratch area, sources "conda.sh", lists the available environments and then activates the base environment. It then installs "cudatoolkit" to provide GPU support for the Anaconda python packages, and the "numba" package which can make use of the GPU. Finally it runs the python script:

#!/bin/bash
# Install Anaconda into the job's scratch area in batch (non-interactive) mode
bash ./Anaconda3-2022.05-Linux-x86_64.sh -b -p ${PWD}/gputest
# Make the "conda" command available and activate the base environment
source ${PWD}/gputest/etc/profile.d/conda.sh
conda info -e
conda activate base
# Install GPU support and numba, answering "yes" to prompts
conda install -y cudatoolkit numba
# Run the GPU test script with the conda python
python ./gpu_test.py

The python script "gpu_test.py" simply asks numba to list the GPUs it can see:

#!/usr/bin/env python
from numba import cuda
print(cuda.gpus)
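
The job can then be submitted and its output retrieved with the standard DIRAC commands. As a sketch, assuming a configured DIRAC UI with a valid proxy, and that the JDL above has been saved as "gpu_test.jdl" (an assumed filename):

# Submit the JDL; DIRAC prints the JobID of the new job
dirac-wms-job-submit gpu_test.jdl
# After the job has finished, fetch the output sandbox (StdOut and StdErr)
dirac-wms-job-get-output <JobID>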

If we submit the job and look at the end of our output:

<Managed Device 0>

This shows that we can access the GPU from python.
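
Once the GPU is visible, a slightly longer script run in the same conda environment can check that kernels actually execute on the device. The following is only an illustrative sketch (the kernel, array sizes and launch configuration are arbitrary choices, not part of the site setup):

# Illustrative numba CUDA sketch: add two vectors on the GPU
import numpy as np
from numba import cuda

@cuda.jit
def add_kernel(x, y, out):
    i = cuda.grid(1)          # absolute index of this thread
    if i < out.size:          # guard threads beyond the end of the arrays
        out[i] = x[i] + y[i]

n = 1024
x = np.arange(n, dtype=np.float32)
y = 2 * x
out = np.zeros_like(x)

threads_per_block = 128
blocks_per_grid = (n + threads_per_block - 1) // threads_per_block
# numba copies the NumPy arrays to the device, runs the kernel and copies the result back
add_kernel[blocks_per_grid, threads_per_block](x, y, out)

print(out[:5])                # expected: [ 0.  3.  6.  9. 12.]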


Container Example Using DIRAC

DIRAC provides Singularity (now developed as Apptainer, https://apptainer.org/), which can be used for running containers. You can use Singularity within DIRAC jobs, and you can also use it "interactively" once you have activated your DIRAC user interface.
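
For GPU work the important detail is Singularity's "--nv" option, which exposes the host's NVIDIA driver and GPU devices inside the container. As an untested sketch (the CUDA container image here is only an example, and being able to pull it from Docker Hub on a worker node is an assumption):

# Run nvidia-smi inside a CUDA container, passing the host GPU through with --nv
singularity exec --nv docker://nvidia/cuda:12.2.0-base-ubuntu22.04 nvidia-smi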