Maui Preemption

From GridPP Wiki

Introduction

Preemption is the name Maui gives to the ability to suspend running jobs so that higher-priority jobs can run. Once the higher-priority job has completed, the suspended job is resumed.

Software Versions

Maui preemption was tested on maui version 3.2.6p14 (maui-3.2.6p14-3_SL30X_ratio01) and torque 2.1.5 (torque-2.1.5-1cri_sl3_1st). Maui and torque were built by Steve Traylen.

How It Works

Preemption requires several different parameters to be set. If any one of them is missed out, preemption will not work. These are:

PREEMPTPOLICY

The two PREEMPTPOLICY options considered here are:

REQUEUE - This terminates the job and requeues it
SUSPEND - This suspends the job and resumes it once the job that caused the
          preemption completes

BACKFILLPOLICY

The BACKFILLPOLICY has four possible options namely -

 FIRSTFIT - Starts the first queued job, taken in priority order, that fits in the available backfill window; this is the default
 BESTFIT - Starts the job that best fits the backfill window, to make best use of the batch system
 NONE - Disables backfill
 GREEDY - Evaluates combinations of jobs and starts the combination that best fits the backfill window

RESERVATIONPOLICY

The RESERVATIONPOLICY parameter has three possible options namely -

CURRENTHIGHEST
HIGHEST
NONE

N.B. For preemption to take place, Maui first sends the suspend signal to the job that has to be suspended; only on the next iteration of the scheduler does it start the preemptor job. Do not be surprised to see jobs suspended in the queue while the preemptor jobs are still waiting to run.
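
The two-step behaviour described in the note above can be pictured with a toy model. This is only an illustrative sketch, not Maui code: the function name `simulate` is hypothetical, and the model assumes a single preemptee job, a single preemptor job, and the job states Running, Idle and Suspended.

```python
def simulate(iterations=3):
    """Toy model of one preemptee and one preemptor over scheduler iterations.

    Returns a list of (preemptee_state, preemptor_state) tuples, one per
    iteration. This is an assumption-laden illustration, not Maui's logic.
    """
    preemptee, preemptor = "Running", "Idle"
    history = []
    for _ in range(iterations):
        if preemptee == "Running" and preemptor == "Idle":
            # First iteration: only the suspend signal is sent.
            preemptee = "Suspended"
        elif preemptee == "Suspended" and preemptor == "Idle":
            # Following iteration: the freed slot is given to the preemptor.
            preemptor = "Running"
        history.append((preemptee, preemptor))
    return history


if __name__ == "__main__":
    for step, (victim, winner) in enumerate(simulate(), start=1):
        print(f"iteration {step}: preemptee={victim} preemptor={winner}")
```

After the first iteration the model shows the preemptee suspended while the preemptor is still idle, which is the intermediate state the note warns about.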

PREEMPTORs and PREEMPTEEs

The preemption facilities within Maui are not as flexible as those in other batch systems such as Sun Grid Engine. A queue can be either a preemptor (i.e., a queue whose jobs can preempt others) or a preemptee (i.e., a queue whose jobs can be preempted), but not both. You can define one queue as a preemptor and several as preemptees, or several queues as preemptors and one as a preemptee.
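
The "either preemptor or preemptee, but not both" rule can be checked mechanically before a configuration is loaded. The following is a minimal sketch, assuming QOSCFG lines of the form used on this page (`QOSCFG[name] QFLAGS=...`) and assuming that multiple flags, if present, are separated by commas or colons. The function names `preemption_roles` and `conflicting_qos` are hypothetical helpers, not part of Maui.

```python
import re

# Matches lines such as: QOSCFG[short]  QFLAGS=PREEMPTOR
QOSCFG_LINE = re.compile(r"^\s*QOSCFG\[(?P<name>[^\]]+)\]\s+(?P<attrs>.*)$")
PREEMPT_FLAGS = {"PREEMPTOR", "PREEMPTEE"}


def preemption_roles(cfg_text):
    """Map each QOS name to the set of preemption flags in its QFLAGS."""
    roles = {}
    for raw in cfg_text.splitlines():
        line = raw.split("#", 1)[0]  # ignore comments
        match = QOSCFG_LINE.match(line)
        if not match:
            continue
        for attr in match.group("attrs").split():
            key, _, value = attr.partition("=")
            if key.upper() != "QFLAGS":
                continue
            # Assumption: flags are separated by ',' or ':'.
            flags = {f.upper() for f in re.split(r"[,:]", value) if f}
            roles.setdefault(match.group("name"), set()).update(
                flags & PREEMPT_FLAGS
            )
    return roles


def conflicting_qos(cfg_text):
    """Return the QOS names flagged as both PREEMPTOR and PREEMPTEE."""
    roles = preemption_roles(cfg_text)
    return sorted(name for name, flags in roles.items() if len(flags) == 2)
```

Calling `conflicting_qos` on the text of a maui.cfg file returns an empty list when every queue has a single role.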

QOSCFG

To configure preemption, first activate the QOS priority component by setting the QOSWEIGHT parameter to a value of 1 or greater.

 QOSWEIGHT 1 

You can then use the QOSCFG parameters to define which queues are preemptors and which are preemptees.

 QOSCFG[short]  QFLAGS=PREEMPTOR
 QOSCFG[long]   QFLAGS=PREEMPTEE

CLASSCFG

Once you have defined which queues are preemptors and which are preemptees, you then use CLASSCFG to define priorities. As with the QOSCFG options, you need to turn on the CLASSCFG options by setting CLASSWEIGHT.

CLASSWEIGHT 1 

You then define a CLASSCFG entry for each queue, giving it a priority and associating it with a QOSCFG entry using the QDEF parameter.

CLASSCFG[short] QDEF=short PRIORITY=11000 
CLASSCFG[long]	QDEF=long PRIORITY=10000 

N.B. With priorities the higher the number, the higher the priority
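
Since a single missing parameter is enough to stop preemption from working, it can help to check a configuration for the parameters named in this section. The sketch below is an illustration only: the `REQUIRED` list simply mirrors the parameters this page describes, and `missing_parameters` is a hypothetical helper, not a Maui tool.

```python
# Parameters this page says must be set for preemption to work.
REQUIRED = (
    "PREEMPTPOLICY",
    "BACKFILLPOLICY",
    "RESERVATIONPOLICY",
    "QOSWEIGHT",
    "QOSCFG",
    "CLASSWEIGHT",
    "CLASSCFG",
)


def missing_parameters(cfg_text):
    """Return the required parameters that never appear in cfg_text."""
    seen = set()
    for raw in cfg_text.splitlines():
        line = raw.split("#", 1)[0].strip()  # ignore comments and blanks
        if not line:
            continue
        # "QOSCFG[short]" and "QOSWEIGHT" both reduce to the bare keyword.
        keyword = line.split()[0].split("[", 1)[0].upper()
        seen.add(keyword)
    return [name for name in REQUIRED if name not in seen]


if __name__ == "__main__":
    sample = """
    PREEMPTPOLICY   SUSPEND
    QOSWEIGHT       1
    QOSCFG[short]   QFLAGS=PREEMPTOR
    QOSCFG[long]    QFLAGS=PREEMPTEE
    CLASSWEIGHT     1
    CLASSCFG[short] QDEF=short PRIORITY=11000
    CLASSCFG[long]  QDEF=long PRIORITY=10000
    """
    print(missing_parameters(sample))
```

The check only looks for the presence of each keyword; it does not validate the values.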

Examples

One Preemptor and Multiple Preemptees

In this example we have one queue (ops) as a preemptor and three queues (short, medium and long) as preemptees.


SERVERHOST              helmsley.dur.scotgrid.ac.uk
ADMIN1                  root
ADMIN3			edginfo rgma
ADMINHOST               helmsley.dur.scotgrid.ac.uk
RMCFG[base]             TYPE=PBS
#
SERVERPORT              40559
SERVERMODE              NORMAL
#
# Set PBS server polling interval. If you have short
# queues and/or jobs it is worth setting a short interval (30 seconds here).
#
RMPOLLINTERVAL        00:00:30
#
# a max. 10 MByte log file in a logical location
LOGFILE               /var/log/maui.log 
LOGFILEMAXSIZE        10000000
LOGLEVEL              2
NODEACCESSPOLICY	shared
#
# Set the delay to 1 minute before Maui tries to run a job again,
# in case it failed to run the first time.
# The default value is 1 hour.
#
DEFERTIME       00:01:00
#
# 	Set preempt policy to suspend if possible
#
PREEMPTPOLICY		SUSPEND
BACKFILLPOLICY        FIRSTFIT
RESERVATIONPOLICY     CURRENTHIGHEST
NODEALLOCATIONPOLICY  MINRESOURCE
#
#	Fair share policy
#
FSPOLICY	DEDICATEDPS
FSINTERVAL	24:00:00
FSQOSWEIGHT	2 
#
QOSWEIGHT		1
QOSCFG[ops]		QFLAGS=PREEMPTOR
#
QOSCFG[short]		QFLAGS=PREEMPTEE 
QOSCFG[medium]		QFLAGS=PREEMPTEE
QOSCFG[long]		QFLAGS=PREEMPTEE
#
CLASSWEIGHT		1
CLASSCFG[DEFAULT]	PRIORITY=100	
CLASSCFG[ops]		QDEF=ops PRIORITY=11000 
CLASSCFG[short]        QDEF=short PRIORITY=9000 
CLASSCFG[medium]	QDEF=medium PRIORITY=8000	
CLASSCFG[long]		QDEF=long PRIORITY=6000	
#
CONSUMEDWEIGHT		3
CREDWEIGHT		1
GROUPWEIGHT		1
USERWEIGHT		1
SRCFGWEIGHT		2
#
#	Taken from the MAUI cookbook
#
QUEUETIMEWAIT	1	# Don't work as a FIFO
#
ENABLENEGJOBPRIORITY	true
REJECTNEGPRIOJOB	false
#
NODEALLOCATIONPOLICY	PRIORITY
NODECFG[DEFAULT]	SLOT=2
#
#	Trying node set 
# 	(http://www.supercluster.org/documentation/maui/8.3nodesetoverview.shtml) 
#
NODESETPOLICY    ONEOF
NODESETATTRIBUTE FEATURE 
# 
#	Stops deferred jobs
#
DEFERTIME	0

Multiple Preemptors and One Preemptee

In this example we have four queues that are preemptors (ops, short, medium and long) and one queue (grid) that is a preemptee.

#
SERVERHOST              helmsley.dur.scotgrid.ac.uk
ADMIN1                  root
ADMIN3			edginfo rgma
ADMINHOST               helmsley.dur.scotgrid.ac.uk
RMCFG[base]             TYPE=PBS
#
SERVERPORT              40559
SERVERMODE              NORMAL
#
# Set PBS server polling interval. If you have short
# queues and/or jobs it is worth setting a short interval (30 seconds here).
#
RMPOLLINTERVAL        00:00:30
#
# a max. 10 MByte log file in a logical location
#
LOGFILE               /var/log/maui.log
LOGFILEMAXSIZE        10000000
LOGLEVEL              2
NODEACCESSPOLICY	shared
# Set the delay to 1 minute before Maui tries to run a job again,
# in case it failed to run the first time.
# The default value is 1 hour.
#
DEFERTIME       00:01:00
#
#
#	Set preempt policy to suspend if possible
#
PREEMPTPOLICY		SUSPEND
#
BACKFILLPOLICY        FIRSTFIT
RESERVATIONPOLICY     CURRENTHIGHEST
NODEALLOCATIONPOLICY  MINRESOURCE
#
#	Fair share policy
#
#
FSPOLICY	DEDICATEDPS
FSINTERVAL	24:00:00
FSQOSWEIGHT	2
#
QOSWEIGHT		1
QOSCFG[ops]		QFLAGS=PREEMPTOR
QOSCFG[short]		QFLAGS=PREEMPTOR 
QOSCFG[medium]		QFLAGS=PREEMPTOR
QOSCFG[long]		QFLAGS=PREEMPTOR 
QOSCFG[grid]		QFLAGS=PREEMPTEE
#
CLASSWEIGHT		1
# 
CLASSCFG[DEFAULT]	PRIORITY=100	
CLASSCFG[ops]		QDEF=ops PRIORITY=11000 
CLASSCFG[short]	QDEF=short PRIORITY=9000 
CLASSCFG[medium]	QDEF=medium PRIORITY=8000	
CLASSCFG[long]		QDEF=long PRIORITY=6000	
CLASSCFG[grid]		QDEF=grid PRIORITY=4000
#
CONSUMEDWEIGHT		3
CREDWEIGHT		1
GROUPWEIGHT		1
USERWEIGHT		1
SRCFGWEIGHT		2
#
#	Taken from the MAUI cookbook
#
# 
QUEUETIMEWAIT	1	# Don't work as a FIFO
#
ENABLENEGJOBPRIORITY	true
REJECTNEGPRIOJOB	false 
#
NODEALLOCATIONPOLICY	PRIORITY
#
#
#	Trying node set 
#	(http://www.supercluster.org/documentation/maui/8.3nodesetoverview.shtml)
#
#
NODESETPOLICY    ONEOF
NODESETATTRIBUTE FEATURE 
#
#	Stops deferred jobs
#
DEFERTIME	0

Checking Preemption Works

The first check is to run the command showconfig | less and check that the number of PREEMPTOR and PREEMPTEE entries in the running configuration matches the number in the maui.cfg file. The second check is to load the preemptee queue(s), then submit jobs to the preemptor queue and check that the jobs in the preemptee queue(s) are suspended, and that they are resumed once the jobs in the preemptor queue complete.
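
The first check can be partly automated by counting the preemption flags in two pieces of text: the contents of maui.cfg and the captured output of showconfig. The sketch below is hedged accordingly. It assumes that both texts contain lines beginning with `QOSCFG[...]` that carry the `PREEMPTOR` or `PREEMPTEE` flag, which should be confirmed against the real showconfig output on your system, and `count_preemption_flags` is a hypothetical helper name.

```python
import re


def count_preemption_flags(text):
    """Count QOSCFG lines carrying the PREEMPTOR and PREEMPTEE flags."""
    counts = {"PREEMPTOR": 0, "PREEMPTEE": 0}
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].upper()  # ignore comments
        if not line.lstrip().startswith("QOSCFG["):
            continue
        for flag in counts:
            if re.search(r"\b" + flag + r"\b", line):
                counts[flag] += 1
    return counts


if __name__ == "__main__":
    # QOSCFG lines from the "One Preemptor and Multiple Preemptees" example.
    cfg = """
    QOSCFG[ops]     QFLAGS=PREEMPTOR
    QOSCFG[short]   QFLAGS=PREEMPTEE
    QOSCFG[medium]  QFLAGS=PREEMPTEE
    QOSCFG[long]    QFLAGS=PREEMPTEE
    """
    print(count_preemption_flags(cfg))
```

In practice the same function would be applied to the text of maui.cfg and to the saved showconfig output, and the two dictionaries compared; a mismatch suggests the running scheduler has not picked up the file.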