Maui Preemption
Introduction
Preemption is the name Maui gives to the ability to suspend running jobs so that higher-priority jobs can run. Once the higher-priority job has completed, the suspended job is resumed.
Software Versions
Maui preemption was tested with Maui version 3.2.6p14 (maui-3.2.6p14-3_SL30X_ratio01) and Torque 2.1.5 (torque-2.1.5-1cri_sl3_1st). Both Maui and Torque were built by Steve Traylen.
How It Works
Preemption requires several different parameters to be set; if any one of them is missing, preemption will not work. These are:
PREEMPTPOLICY
The PREEMPTPOLICY parameter has two possible options:
REQUEUE - Terminates the preempted job and requeues it.
SUSPEND - Suspends the preempted job and resumes it once the job that caused the preemption completes.
BACKFILLPOLICY
The BACKFILLPOLICY parameter has four possible options:
FIRSTFIT - This causes jobs to be scheduled based upon the priorities set on the queues. This is the default.
BESTFIT - This will cause the priority of jobs to be changed to allow for best use of the batch system.
NONE - Disable backfill.
GREEDY -
RESERVATIONPOLICY
The RESERVATIONPOLICY parameter has three possible options:
CURRENTHIGHEST
HIGHEST
NONE
N.B. When preemption takes place, Maui first sends the suspend signal to the job that has to be suspended; only on the next iteration of the scheduler does it start the preemptor job. Do not be surprised to see suspended jobs in the queue while the preemptor jobs are still waiting to run.
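Since a single missing parameter is enough to stop preemption from working, a quick sanity check of the three policy parameters above can be useful. The following is a minimal sketch in Python, not part of Maui: the function name check_policies is hypothetical, and the allowed values are simply those listed on this page (other Maui versions may accept more).

```python
# Allowed values as listed on this page; this is an assumption, not an
# authoritative list for every Maui version.
ALLOWED = {
    "PREEMPTPOLICY": {"REQUEUE", "SUSPEND"},
    "BACKFILLPOLICY": {"FIRSTFIT", "BESTFIT", "NONE", "GREEDY"},
    "RESERVATIONPOLICY": {"CURRENTHIGHEST", "HIGHEST", "NONE"},
}

def check_policies(cfg_text):
    """Return a list of problems found for the three policy parameters."""
    seen = {}
    for raw in cfg_text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments
        parts = line.split()
        if len(parts) >= 2 and parts[0] in ALLOWED:
            seen[parts[0]] = parts[1].upper()  # a later setting overrides an earlier one
    problems = []
    for name, allowed in ALLOWED.items():
        if name not in seen:
            problems.append(f"{name} is not set")
        elif seen[name] not in allowed:
            problems.append(f"{name} has unexpected value {seen[name]}")
    return problems

sample = """
PREEMPTPOLICY SUSPEND
BACKFILLPOLICY FIRSTFIT
# RESERVATIONPOLICY CURRENTHIGHEST
"""
print(check_policies(sample))  # ['RESERVATIONPOLICY is not set']
```

The check only looks at the text of the configuration file; it does not talk to the running scheduler.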
PREEMPTORs and PREEMPTEEs
The preemption facilities within Maui are not as flexible as those in other batch systems such as Sun Grid Engine. A queue can be either a preemptor (i.e., a queue whose jobs can preempt others) or a preemptee (i.e., a queue whose jobs can be preempted), but not both. You can define one queue as a preemptor and several as preemptees, or several queues as preemptors and one as a preemptee.
QOSCFG
To configure preemption, first activate the QOS weighting by setting the QOSWEIGHT parameter to a value of 1 or greater.
QOSWEIGHT 1
You can then use the QOSCFG parameters to define which queues are preemptors and which are preemptees:
QOSCFG[short] QFLAGS=PREEMPTOR
QOSCFG[long] QFLAGS=PREEMPTEE
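Because a queue cannot be both a preemptor and a preemptee, it can help to scan the QOSCFG lines for names that end up with both flags. The sketch below is an illustration only: qos_roles and conflicting are hypothetical helper names, and splitting multiple QFLAGS values on commas or colons is an assumption about how several flags might be written.

```python
import re

QOS_LINE = re.compile(r"^QOSCFG\[(\w+)\]\s+(.*)$")

def qos_roles(cfg_text):
    """Map each QOS name to the set of preemption flags it carries."""
    roles = {}
    for raw in cfg_text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments
        m = QOS_LINE.match(line)
        if not m:
            continue
        name, rest = m.groups()
        for token in rest.split():
            if token.upper().startswith("QFLAGS="):
                # Assumption: multiple flags are separated by ',' or ':'.
                flags = re.split(r"[,:]", token.split("=", 1)[1].upper())
                wanted = {f for f in flags if f in ("PREEMPTOR", "PREEMPTEE")}
                roles.setdefault(name, set()).update(wanted)
    return roles

def conflicting(roles):
    """Return the QOS names flagged as both preemptor and preemptee."""
    return sorted(name for name, r in roles.items() if len(r) > 1)

sample = """
QOSCFG[short] QFLAGS=PREEMPTOR
QOSCFG[long] QFLAGS=PREEMPTEE
QOSCFG[short] QFLAGS=PREEMPTEE
"""
print(conflicting(qos_roles(sample)))  # ['short']
```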
CLASSCFG
Once you have defined which queues are preemptors and which are preemptees, you then use CLASSCFG to define priorities. As with the QOSCFG options, you need to turn on the CLASSCFG options by setting CLASSWEIGHT:
CLASSWEIGHT 1
You then define a CLASSCFG entry for each queue, giving it a priority and associating it with a QOSCFG entry through the QDEF parameter:
CLASSCFG[short] QDEF=short PRIORITY=11000
CLASSCFG[long] QDEF=long PRIORITY=10000
N.B. With priorities the higher the number, the higher the priority
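Each CLASSCFG entry refers to a QOSCFG entry by name through QDEF, so a mistyped name is an easy mistake to make. The following Python sketch, under the assumption that the configuration uses the KEYWORD[name] KEY=value layout shown above, lists classes whose QDEF has no matching QOSCFG and orders the classes by priority. The helper names parse_cfg, missing_qos and by_priority are hypothetical.

```python
import re

def parse_cfg(cfg_text, keyword):
    """Return {name: {KEY: value}} for lines like KEYWORD[name] KEY=value ..."""
    pattern = re.compile(r"^" + re.escape(keyword) + r"\[(\w+)\]\s+(.*)$")
    out = {}
    for raw in cfg_text.splitlines():
        m = pattern.match(raw.split("#", 1)[0].strip())  # drop comments
        if m:
            pairs = (tok.split("=", 1) for tok in m.group(2).split() if "=" in tok)
            out.setdefault(m.group(1), {}).update({k.upper(): v for k, v in pairs})
    return out

def missing_qos(cfg_text):
    """Classes whose QDEF names a QOS with no QOSCFG entry."""
    qos = parse_cfg(cfg_text, "QOSCFG")
    classes = parse_cfg(cfg_text, "CLASSCFG")
    return sorted(c for c, kv in classes.items()
                  if "QDEF" in kv and kv["QDEF"] not in qos)

def by_priority(cfg_text):
    """Class names ordered from highest to lowest PRIORITY."""
    classes = parse_cfg(cfg_text, "CLASSCFG")
    return sorted(classes, key=lambda c: int(classes[c].get("PRIORITY", 0)),
                  reverse=True)

sample = """
QOSCFG[short] QFLAGS=PREEMPTOR
QOSCFG[long] QFLAGS=PREEMPTEE
CLASSCFG[short] QDEF=short PRIORITY=11000
CLASSCFG[long] QDEF=long PRIORITY=10000
CLASSCFG[grid] QDEF=grid PRIORITY=4000
"""
print(missing_qos(sample))   # ['grid']
print(by_priority(sample))   # ['short', 'long', 'grid']
```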
Examples
One Preemptor and Multiple Preemptees
In this example we have one queue (ops) as a preemptor and three queues (short, medium and long) as preemptees.
SERVERHOST helmsley.dur.scotgrid.ac.uk
ADMIN1 root
ADMIN3 edginfo rgma
ADMINHOST helmsley.dur.scotgrid.ac.uk
RMCFG[base] TYPE=PBS
#
SERVERPORT 40559
SERVERMODE NORMAL
#
# Set PBS server polling interval. If you have short
# queues or/and jobs it is worth to set a short interval. (10 seconds)
#
RMPOLLINTERVAL 00:00:30
#
# a max. 10 MByte log file in a logical location
LOGFILE /var/log/maui.log
LOGFILEMAXSIZE 10000000
LOGLEVEL 2
NODEACCESSPOLICY shared
#
# Set the delay to 1 minute before Maui tries to run a job again,
# in case it failed to run the first time.
# The default value is 1 hour.
#
DEFERTIME 00:01:00
#
# Set preempt policy to suspend if possible
#
PREEMPTPOLICY SUSPEND
BACKFILLPOLICY FIRSTFIT
RESERVATIONPOLICY CURRENTHIGHEST
NODEALLOCATIONPOLICY MINRESOURCE
#
# Fair share policy
#
FSPOLICY DEDICATEDPS
FSINTERVAL 24:00:00
FSQOSWEIGHT 2
#
QOSWEIGHT 1
QOSCFG[ops] QFLAGS=PREEMPTOR
#
QOSCFG[short] QFLAGS=PREEMPTEE
QOSCFG[medium] QFLAGS=PREEMPTEE
QOSCFG[long] QFLAGS=PREEMPTEE
#
CLASSWEIGHT 1
CLASSCFG[DEFAULT] PRIORITY=100
CLASSCFG[ops] QDEF=ops PRIORITY=11000
CLASSCFG[short] QDEF=short PRIORITY=9000
CLASSCFG[medium] QDEF=medium PRIORITY=8000
CLASSCFG[long] QDEF=long PRIORITY=6000
#
CONSUMEDWEIGHT 3
CREDWEIGHT 1
GROUPWEIGHT 1
USERWEIGHT 1
SRCFGWEIGHT 2
#
# Taken from the MAUI cookbook
#
QUEUETIMEWAIT 1
# Don't work as a FIFO
#
ENABLENEGJOBPRIORITY true
REJECTNEGPRIOJOB false
#
NODEALLOCATIONPOLICY PRIORITY
NODECFG[DEFAULT] SLOT=2
#
# Trying node set
# (http://www.supercluster.org/documentation/maui/8.3nodesetoverview.shtml)
#
NODESETPOLICY ONEOF
NODESETATTRIBUTE FEATURE
#
# Stops defered jobs
#
DEFERTIME 0
Multiple Preemptors and One Preemptee
In this example we have four queues that are preemptors (ops, short, medium and long) and one queue (grid) that is a preemptee.
#
SERVERHOST helmsley.dur.scotgrid.ac.uk
ADMIN1 root
ADMIN3 edginfo rgma
ADMINHOST helmsley.dur.scotgrid.ac.uk
RMCFG[base] TYPE=PBS
#
SERVERPORT 40559
SERVERMODE NORMAL
#
# Set PBS server polling interval. If you have short
# queues or/and jobs it is worth to set a short interval. (10 seconds)
#
RMPOLLINTERVAL 00:00:30
#
# a max. 10 MByte log file in a logical location
#
LOGFILE /var/log/maui.log
LOGFILEMAXSIZE 10000000
LOGLEVEL 2
NODEACCESSPOLICY shared
# Set the delay to 1 minute before Maui tries to run a job again,
# in case it failed to run the first time.
# The default value is 1 hour.
#
DEFERTIME 00:01:00
#
#
# Set preempt policy to suspend if possible
#
PREEMPTPOLICY SUSPEND
#
BACKFILLPOLICY FIRSTFIT
RESERVATIONPOLICY CURRENTHIGHEST
NODEALLOCATIONPOLICY MINRESOURCE
#
# Fair share policy
#
#
FSPOLICY DEDICATEDPS
FSINTERVAL 24:00:00
FSQOSWEIGHT 2
#
QOSWEIGHT 1
QOSCFG[ops] QFLAGS=PREEMPTOR
QOSCFG[short] QFLAGS=PREEMPTOR
QOSCFG[medium] QFLAGS=PREEMPTOR
QOSCFG[long] QFLAGS=PREEMPTOR
QOSCFG[grid] QFLAGS=PREEMPTEE
#
CLASSWEIGHT 1
#
CLASSCFG[DEFAULT] PRIORITY=100
CLASSCFG[ops] QDEF=ops PRIORITY=11000
CLASSCFG[short] QDEF=short PRIORITY=9000
CLASSCFG[medium] QDEF=medium PRIORITY=8000
CLASSCFG[long] QDEF=long PRIORITY=6000
CLASSCFG[grid] QDEF=grid PRIORITY=4000
#
CONSUMEDWEIGHT 3
CREDWEIGHT 1
GROUPWEIGHT 1
USERWEIGHT 1
SRCFGWEIGHT 2
#
# Taken from the MAUI cookbook
#
#
QUEUETIMEWAIT 1
# Don't work as a FIFO
#
ENABLENEGJOBPRIORITY true
REJECTNEGPRIOJOB false
#
NODEALLOCATIONPOLICY PRIORITY
#
#
# Trying node set
# (http://www.supercluster.org/documentation/maui/8.3nodesetoverview.shtml)
#
#
NODESETPOLICY ONEOF
NODESETATTRIBUTE FEATURE
#
# Stops defered jobs
#
DEFERTIME 0
Checking Preemption Works
The first check is to run the command showconfig | less and check that the number of PREEMPTOR and PREEMPTEE entries in the running configuration is the same as the number in the maui.cfg file. The second check is to load the preemptee queue(s), then submit jobs to the preemptor queue(s) and check that the jobs in the preemptee queue(s) are suspended and then restarted once the jobs in the preemptor queue(s) complete.
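For the first check, the maui.cfg side of the comparison can be counted automatically. This is a minimal Python sketch with a hypothetical helper name, count_preempt_flags; it only counts flags in the text of the configuration file and makes no assumption about the layout of the showconfig output, which still has to be compared by eye.

```python
from collections import Counter

def count_preempt_flags(cfg_text):
    """Count QOSCFG lines carrying the PREEMPTOR or PREEMPTEE flag."""
    counts = Counter()
    for raw in cfg_text.splitlines():
        line = raw.split("#", 1)[0].strip().upper()  # drop comments
        if not line.startswith("QOSCFG[") or "]" not in line:
            continue
        settings = line.split("]", 1)[1]  # ignore the QOS name itself
        for role in ("PREEMPTOR", "PREEMPTEE"):
            if role in settings:
                counts[role] += 1
    return counts

# QOSCFG lines from the "one preemptor" example on this page.
sample = """
QOSCFG[ops] QFLAGS=PREEMPTOR
#
QOSCFG[short] QFLAGS=PREEMPTEE
QOSCFG[medium] QFLAGS=PREEMPTEE
QOSCFG[long] QFLAGS=PREEMPTEE
"""
counts = count_preempt_flags(sample)
print(counts["PREEMPTOR"], counts["PREEMPTEE"])  # 1 3
```

To use it on a real system, pass in the contents of your own maui.cfg instead of the sample string.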