RAL HTCondor Multicore Jobs Configuration

From GridPP Wiki
Revision as of 08:32, 21 May 2014 by Andrew Lahiff b13e4f09e2 (Talk | contribs)

(diff) ← Older revision | Latest revision (diff) | Newer revision → (diff)
Jump to: navigation, search

Defrag daemon setup

Our current configuration for the defrag daemon is:

DAEMON_LIST = $(DAEMON_LIST) DEFRAG
DEFRAG_INTERVAL = 600
DEFRAG_DRAINING_MACHINES_PER_HOUR = 30.0
DEFRAG_MAX_CONCURRENT_DRAINING = 60
DEFRAG_MAX_WHOLE_MACHINES = 300
DEFRAG_SCHEDULE = graceful
## Allow some defrag configuration to be settable DEFRAG.SETTABLE_ATTRS_ADMINISTRATOR = DEFRAG_MAX_CONCURRENT_DRAINING,DEFRAG_DRAINING_MACHINES_PER_HOUR,DEFRAG_MAX_WHOLE_MACHINES ENABLE_RUNTIME_CONFIG = TRUE
## Which machines are more desirable to drain DEFRAG_RANK = ifThenElse(Cpus >= 8, -10, (TotalCpus - Cpus)/(8.0 - Cpus))
# Definition of a "whole" machine: # - anything with 8 cores (since multicore jobs only need 8 cores, don't need to drain whole machines with > 8 cores) # - must be configured to actually start new jobs (otherwise machines which are deliberately being drained will be included) DEFRAG_WHOLE_MACHINE_EXPR = ((Cpus == TotalCpus) || (Cpus >= 8)) && StartJobs =?= True
# Decide which machines to drain # - must not be cloud machines # - must be healthy # - must be configured to actually start new jobs DEFRAG_REQUIREMENTS = PartitionableSlot && Offline =!= True && RalCluster =!= "wn-cloud" && StartJobs =?= True && NODE_IS_HEALTHY =?= True
## Logs MAX_DEFRAG_LOG = 104857600 MAX_NUM_DEFRAG_LOG = 10