Enable Cgroups in HTCondor


The following are the steps to enable cgroups on an HTCondor worker node (WN). In this example we use node067 as a representative node:

1. Ensure the libcgroup package is installed; if not, install it with yum as shown below:

node067:~# rpm -qa | grep cgroup
libcgroup-0.37-7.el6.x86_64
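
If the query returns nothing, the package can be installed from the standard repositories:

node067:~# yum install libcgroup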

2. Add a group htcondor to /etc/cgconfig.conf:

node067:~# cat /etc/cgconfig.conf
mount {
      cpu     = /cgroup/cpu;
      cpuset  = /cgroup/cpuset;
      cpuacct = /cgroup/cpuacct;
      devices = /cgroup/devices;
      memory  = /cgroup/memory;
      freezer = /cgroup/freezer;
      net_cls = /cgroup/net_cls;
      blkio   = /cgroup/blkio;
}
group htcondor {
      cpu {}
      cpuacct {}
      memory {}
      freezer {}
      blkio {}
}
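
The new group can also be applied without restarting the service by parsing the configuration file directly; a quick sketch using cgconfigparser, which ships with libcgroup:

node067:~# cgconfigparser -l /etc/cgconfig.conf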

3. Start the cgconfig daemon; a directory named htcondor will be created under each /cgroup/*/ hierarchy (a check across all controllers is sketched below):

node067:~# service cgconfig start
node067:~# chkconfig cgconfig on
node067:~# ll -d  /cgroup/memory/htcondor/
drwxr-xr-x. 66 root root 0 Oct  9 11:58 /cgroup/memory/htcondor/
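
To confirm the group was created in every mounted controller rather than just memory, the same check can be run with a glob; a minimal sketch:

node067:~# ls -d /cgroup/*/htcondor/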

4. In the HTCondor WN configuration, add the following lines for the STARTD daemon, then restart the startd daemon (a verification sketch follows the snippet):

# Enable CGROUP control
BASE_CGROUP = htcondor
# hard: the job cannot access more physical memory than allocated
# soft: the job may access more physical memory than allocated while free memory is available
CGROUP_MEMORY_LIMIT_POLICY = soft
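
A sketch of verifying the new values and restarting just the startd; condor_config_val and condor_restart are standard HTCondor tools, though a plain "service condor restart" works too:

node067:~# condor_config_val BASE_CGROUP CGROUP_MEMORY_LIMIT_POLICY
htcondor
soft
node067:~# condor_restart -daemon startd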

Once jobs are running on this WN, a set of condor_tmp_condor_slot* directories will be created under /cgroup/*/htcondor/:

node067:~# ll -d  /cgroup/memory/htcondor/condor_tmp_condor_slot1_*
drwxr-xr-x. 2 root root 0 Oct  8 09:22 /cgroup/memory/htcondor/condor_tmp_condor_slot1_10@node067.beowulf.cluster
drwxr-xr-x. 2 root root 0 Oct  9 06:26 /cgroup/memory/htcondor/condor_tmp_condor_slot1_11@node067.beowulf.cluster
drwxr-xr-x. 2 root root 0 Oct  9 05:02 /cgroup/memory/htcondor/condor_tmp_condor_slot1_12@node067.beowulf.cluster
drwxr-xr-x. 2 root root 0 Oct  9 05:18 /cgroup/memory/htcondor/condor_tmp_condor_slot1_13@node067.beowulf.cluster
drwxr-xr-x. 2 root root 0 Oct  9 10:42 /cgroup/memory/htcondor/condor_tmp_condor_slot1_14@node067.beowulf.cluster
drwxr-xr-x. 2 root root 0 Oct  8 12:32 /cgroup/memory/htcondor/condor_tmp_condor_slot1_15@node067.beowulf.cluster
drwxr-xr-x. 2 root root 0 Oct  9 06:52 /cgroup/memory/htcondor/condor_tmp_condor_slot1_16@node067.beowulf.cluster
drwxr-xr-x. 2 root root 0 Oct  9 08:43 /cgroup/memory/htcondor/condor_tmp_condor_slot1_17@node067.beowulf.cluster
drwxr-xr-x. 2 root root 0 Oct  9 06:14 /cgroup/memory/htcondor/condor_tmp_condor_slot1_18@node067.beowulf.cluster

From these directories you can retrieve the resource usage recorded for each job slot, as sketched below.
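
For example, the memory controller records per-slot usage in the standard cgroup accounting files; a minimal sketch using the first slot directory from the listing above:

node067:~# cd /cgroup/memory/htcondor/condor_tmp_condor_slot1_10@node067.beowulf.cluster
node067:~# cat memory.usage_in_bytes      # current memory usage of the job
node067:~# cat memory.max_usage_in_bytes  # peak memory usage of the job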

More information can be found in the HTCondor manual.

Glasgow Scheduling Modifications

To improve scheduling on the Glasgow cluster we statically assign a memory amount based on the type of job in the system. This allows fine-grained control over our memory overcommit and lets us restrict the number of jobs we run on our memory-constrained systems. There are other ways to do this, but this approach allows us to play with the parameters to see what works.

In the submit-condor-job script found in /usr/share/arc/, we alter the following section to look like this:

##############################################################
# Requested memory (mb)
##############################################################
set_req_mem
if [ ! -z "$joboption_memory" ] ; then
 # nb: memory_bytes is in kB (matching HTCondor's ResidentSetSize), despite the name
 memory_bytes=$(( 2000 * 1024 ))
 memory_req=2000
 # HTCondor needs to know the total memory for the job, not memory per core
 if [ ! -z "$joboption_count" ] && [ "$joboption_count" -gt 1 ] ; then
    memory_bytes=$(( $joboption_count * 2000 * 1024 ))
    memory_req=$(( $joboption_count * 2000 ))
 fi
 memory_bytes=$(( $memory_bytes + 4000 * 1024 ))  # +4GB extra as hard limit
 echo "request_memory=$memory_req" >> $LRMS_JOB_DESCRIPT
 echo "+JobMemoryLimit=$memory_bytes" >> $LRMS_JOB_DESCRIPT
 REMOVE="${REMOVE} || ResidentSetSize > JobMemoryLimit"
fi
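
As a worked example, a hypothetical 4-core job (joboption_count=4) would have the following lines appended to its job description; the REMOVE expression then evicts it if its resident set size ever exceeds the limit:

request_memory=8000
+JobMemoryLimit=12288000

Here 8000 is 4 x 2000 MB, and 12288000 is 4 x 2000 x 1024 + 4000 x 1024 kB, i.e. the 8 GB request plus the 4 GB hard-limit headroom.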