HTCondor Jobs In Containers
Latest revision as of 10:04, 9 July 2017
This page describes the setup at RAL for running jobs in CentOS 6 containers on SL7 worker nodes.
Worker nodes
The SL7 worker nodes are configured to be as close as possible to our original SL6 worker nodes, with the following exceptions:
- No grid middleware installed
- No HEP_OSlibs_SL6 RPM and dependencies or equivalent RPMs
- glexec, lcas and lcmaps configuration files only (no RPMs are required)
We are currently using Docker 17.03.0-ce. We found that the only reliable choice for the storage driver is OverlayFS, which seems to be the default now for RHEL7-based systems. The file /etc/docker/daemon.json contains:
    {
        "storage-driver": "overlay",
        "graph": "/pool/docker"
    }
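A quick sanity check of these settings can be scripted. The following is a minimal sketch, not part of the RAL setup: the function name `check_daemon_config` is hypothetical, and the expected values are simply the two keys shown above.

```python
import json

# Expected values, taken from the daemon.json contents shown above.
EXPECTED = {"storage-driver": "overlay", "graph": "/pool/docker"}


def check_daemon_config(text):
    """Return a list of human-readable problems found in a daemon.json string."""
    config = json.loads(text)
    problems = []
    for key, wanted in EXPECTED.items():
        actual = config.get(key)
        if actual != wanted:
            problems.append(f"{key}: expected {wanted!r}, found {actual!r}")
    return problems


# Illustrative input; on a worker node the text would be read from
# /etc/docker/daemon.json instead.
sample = '{"storage-driver": "overlay", "graph": "/pool/docker"}'
print(check_daemon_config(sample))
```

An empty list means both settings match; any other result names the key that differs.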
The partition /pool is an XFS filesystem which is formatted including the option -n ftype=1. This is essential. Without this there will be lots of kernel errors.
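Whether an existing filesystem was formatted with this option can be read from the "naming" line printed by xfs_info. The sketch below assumes that output has already been captured as text (for example from `xfs_info /pool`); the helper name `xfs_ftype_enabled` and the abbreviated sample line are illustrative only.

```python
import re


def xfs_ftype_enabled(xfs_info_output):
    """Return True for ftype=1, False for ftype=0, None if ftype is not reported."""
    match = re.search(r"\bftype=(\d)", xfs_info_output)
    if match is None:
        return None
    return match.group(1) == "1"


# Abbreviated, illustrative excerpt of xfs_info output (only the "naming" line).
SAMPLE = """\
naming   =version 2              bsize=4096   ascii-ci=0 ftype=1
"""
print(xfs_ftype_enabled(SAMPLE))
```

A result of False would indicate that the filesystem needs to be recreated before it is used for OverlayFS.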
We are using CVMFS 2.3.2. In order to ensure that CVMFS mounts survive autofs restarts, the file /etc/systemd/system/autofs.service.d/fuse.conf should be created:
    [Service]
    KillMode=process
HTCondor 8.6.3 or above is recommended. Add the following to sudoers to enable HTCondor to use the Docker CLI as root:
    User_Alias CONDORUSER = condor
    Cmnd_Alias DOCKERCMD = /usr/bin/docker
    CONDORUSER ALL = NOPASSWD: DOCKERCMD
and add the following line to the HTCondor configuration:
    DOCKER = sudo /usr/bin/docker
The alternative method of giving HTCondor permission to run containers, i.e. adding the condor user to the docker group, is problematic with Docker 1.13.1 and above (Docker commands will try to read a config file from /root and not have permission to do so).
Our full HTCondor configuration relating to Docker is as follows:
    DOCKER = sudo /usr/bin/docker
    DOCKER_DROP_ALL_CAPABILITIES=regexp("pilot",x509UserProxyFirstFQAN) =?= False
    DOCKER_MOUNT_VOLUMES=GRID_SECURITY, MJF, GRIDENV, GLEXEC, LCMAPS, LCAS, PASSWD, GROUP, CVMFS, CGROUPS, ATLAS_RECOVERY, ETC_ATLAS, ETC_CMS, ETC_ARC
    DOCKER_VOLUME_DIR_ATLAS_RECOVERY=/pool/atlas/recovery:/pool/atlas/recovery
    DOCKER_VOLUME_DIR_ATLAS_RECOVERY_MOUNT_IF=regexp("atl",Owner)
    DOCKER_VOLUME_DIR_CGROUPS=/sys/fs/cgroup:/sys/fs/cgroup:ro
    DOCKER_VOLUME_DIR_CGROUPS_MOUNT_IF=regexp("atl",Owner)
    DOCKER_VOLUME_DIR_CVMFS=/cvmfs:/cvmfs:shared
    DOCKER_VOLUME_DIR_ETC_ARC=/etc/arc:/etc/arc:ro
    DOCKER_VOLUME_DIR_ETC_ATLAS=/etc/atlas:/etc/atlas:ro
    DOCKER_VOLUME_DIR_ETC_ATLAS_MOUNT_IF=regexp("atl",Owner)
    DOCKER_VOLUME_DIR_ETC_CMS=/etc/cms:/etc/cms:ro
    DOCKER_VOLUME_DIR_ETC_CMS_MOUNT_IF=regexp("cms",Owner)
    DOCKER_VOLUME_DIR_GLEXEC=/etc/glexec.conf:/etc/glexec.conf:ro
    DOCKER_VOLUME_DIR_GRIDENV=/etc/profile.d/grid-env.sh:/etc/profile.d/grid-env.sh:ro
    DOCKER_VOLUME_DIR_GRID_SECURITY=/etc/grid-security:/etc/grid-security:ro
    DOCKER_VOLUME_DIR_GROUP=/etc/group:/etc/group:ro
    DOCKER_VOLUME_DIR_LCAS=/etc/lcas:/etc/lcas:ro
    DOCKER_VOLUME_DIR_LCMAPS=/etc/lcmaps:/etc/lcmaps:ro
    DOCKER_VOLUME_DIR_MJF=/etc/machinefeatures:/etc/machinefeatures:ro
    DOCKER_VOLUME_DIR_PASSWD=/etc/passwd:/etc/passwd:ro
Some comments on this:
- By default HTCondor drops all Linux capabilities in the containers it runs. This prevents glexec from working, so we unfortunately have to keep all standard capabilities for jobs using the pilot role.
- Directories such as /cvmfs, /etc/grid-security, /etc/machinefeatures, /etc/lcas and /etc/lcmaps are bind mounted into the containers for all jobs.
- The glexec config file is bind mounted into the containers.
- For ATLAS jobs only, /sys/fs/cgroup and the job recovery directory are bind mounted into the containers.
- /etc/passwd and /etc/group are bind mounted into containers so that the pool accounts are available.
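The effect of the conditional mounts and of the capabilities expression can be illustrated with a small model. This is a rough sketch in Python, not HTCondor code: it covers only a subset of the volumes above, it assumes that the ClassAd regexp() function matches anywhere in the string (modelled here with re.search), and it does not model the case where x509UserProxyFirstFQAN is undefined. The account names and FQAN strings in the usage lines are hypothetical.

```python
import re

# Subset of the configuration above: volume name -> (bind spec, Owner pattern).
# A pattern of None means there is no _MOUNT_IF, so the volume is always mounted.
VOLUMES = {
    "CVMFS": ("/cvmfs:/cvmfs:shared", None),
    "GRID_SECURITY": ("/etc/grid-security:/etc/grid-security:ro", None),
    "CGROUPS": ("/sys/fs/cgroup:/sys/fs/cgroup:ro", "atl"),
    "ETC_ATLAS": ("/etc/atlas:/etc/atlas:ro", "atl"),
    "ETC_CMS": ("/etc/cms:/etc/cms:ro", "cms"),
}


def volumes_for_owner(owner):
    """Return the bind specs that would be mounted for a job owned by `owner`."""
    selected = []
    for spec, pattern in VOLUMES.values():
        if pattern is None or re.search(pattern, owner):
            selected.append(spec)
    return selected


def drop_all_capabilities(first_fqan):
    """Model of regexp("pilot", x509UserProxyFirstFQAN) =?= False for a defined FQAN."""
    return re.search("pilot", first_fqan) is None


# Hypothetical pool account name and FQAN strings, for illustration only.
print(volumes_for_owner("cmsprd001"))
print(drop_all_capabilities("/atlas/Role=pilot/Capability=NULL"))
print(drop_all_capabilities("/atlas/Role=production/Capability=NULL"))
```

In this model a CMS account gets the common mounts plus /etc/cms, and only jobs whose first FQAN contains "pilot" keep their capabilities.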
CEs
We need to ensure that jobs are submitted using the Docker universe with the appropriate image specified rather than the default Vanilla universe. Assuming HTCondor 8.6.x is running on the CEs, a schedd job transform can be used:
    JOB_TRANSFORM_NAMES = DefaultDocker
    JOB_TRANSFORM_DefaultDocker @=end
    [
    Requirements = JobUniverse == 5 && DockerImage =?= undefined && Owner =!= "nagios";
    set_WantDocker = true;
    eval_set_DockerImage = "alahiff/grid-workernode-c6:20170627.1";
    set_Requirements = ( TARGET.HasDocker ) && ( TARGET.Disk >= RequestDisk ) && ( TARGET.Memory >= RequestMemory ) && ( TARGET.Cpus >= RequestCpus ) && ( TARGET.HasFileTransfer );
    copy_TransferInput = "OriginalTransferInput";
    eval_set_TransferInput = strcat(OriginalTransferInput, ",", Cmd);
    ]
    @end
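To make the effect of the transform easier to follow, here is a simplified model that treats a job ad as a plain Python dict. It is a sketch under stated assumptions rather than a description of how the schedd evaluates ClassAds: the function name `apply_default_docker` is hypothetical, the rewritten Requirements expression is omitted, and a job without TransferInput is left untouched because the ClassAd behaviour for that case is not modelled. JobUniverse 5 is the Vanilla universe.

```python
DEFAULT_IMAGE = "alahiff/grid-workernode-c6:20170627.1"


def apply_default_docker(job):
    """Return a transformed copy of the job ad, or the ad unchanged if it does not match."""
    matches = (
        job.get("JobUniverse") == 5          # Vanilla universe
        and "DockerImage" not in job         # DockerImage =?= undefined
        and job.get("Owner") != "nagios"     # Owner =!= "nagios"
    )
    if not matches:
        return job
    new = dict(job)
    new["WantDocker"] = True
    new["DockerImage"] = DEFAULT_IMAGE
    if "TransferInput" in job:
        # copy_TransferInput, then append the executable to the input file list.
        new["OriginalTransferInput"] = job["TransferInput"]
        new["TransferInput"] = job["TransferInput"] + "," + job["Cmd"]
    return new


# Hypothetical job ad, for illustration only.
job = {"JobUniverse": 5, "Owner": "atlprd001", "Cmd": "/tmp/run.sh", "TransferInput": "input.tar"}
print(apply_default_docker(job))
```

Jobs owned by nagios, and jobs that already specify a Docker image, pass through unchanged in this model.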
Image
The Dockerfile for the image in use is here: https://github.com/alahiff/grid-workernode/blob/master/centos6/Dockerfile. The contents of the image are based on the standard SL6 worker nodes at RAL.