$ squeue --start
Job_priority = (PriorityWeightAge * age_factor) +
(PriorityWeightQOS * QOS_factor) +
(PriorityWeightPartition * partition_factor) +
(PriorityWeightJobSize * job_size_factor) +
(PriorityWeightFairshare * fair-share_factor) +
(PriorityWeightAssoc * assoc_factor) +
SUM(TRES_weight_<type> * TRES_factor_<type>, ...)
- nice_factor + site_factor
We can see the values of the weights in etc/slurm/slurm.conf or with sprio -w
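As a rough illustration of how the terms in the formula combine, the sketch below evaluates it for a single job. All weights and factors are invented for the example and are not Sunbird's configuration (only PriorityWeightAge = 1000 is quoted in these notes). The function name job_priority is hypothetical, and the TRES, nice and site terms default to zero.

```python
def job_priority(weights, factors, tres=(), nice=0, site=0):
    """Weighted sum of the priority factors.

    weights and factors are dicts keyed by factor name; tres is an
    iterable of (weight, factor) pairs. Each factor lies in [0, 1].
    """
    total = sum(weights[name] * factors[name] for name in weights)
    total += sum(w * f for w, f in tres)
    return total - nice + site


# Hypothetical weights: only the age weight is taken from these notes.
weights = {"age": 1000, "qos": 2000, "partition": 1000,
           "job_size": 500, "fairshare": 5000, "assoc": 0}
# Hypothetical factors for one pending job.
factors = {"age": 0.5, "qos": 1.0, "partition": 1.0,
           "job_size": 0.1, "fairshare": 0.25, "assoc": 0.0}

print(job_priority(weights, factors))
```

With these made-up numbers the fair-share term dominates, which is the usual reason a job from a heavily used account waits longer than its age alone would suggest.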
The age factor makes your job more likely to start the longer it spends in the queue.
It increases linearly over time, reaching 1.0 at PriorityMaxAge
(currently, PriorityMaxAge = 7 days on Sunbird, and PriorityWeightAge = 1000).
Time spent in the queue while the job cannot run, because of dependencies or because it is held, does not count.
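The linear ramp described above can be sketched in a few lines. The function name age_factor is hypothetical; the two constants are the Sunbird values quoted above, and the time passed in is assumed to already exclude held or dependency-blocked periods.

```python
from datetime import timedelta

PRIORITY_MAX_AGE = timedelta(days=7)   # value quoted above for Sunbird
PRIORITY_WEIGHT_AGE = 1000             # value quoted above for Sunbird


def age_factor(eligible_time, max_age=PRIORITY_MAX_AGE):
    """Linear ramp from 0 to 1, capped at 1.0 once max_age is reached.

    eligible_time is the time the job has spent in the queue while
    eligible to run (held or dependency-blocked time excluded).
    """
    return min(eligible_time / max_age, 1.0)


for days in (1, 3.5, 7, 10):
    factor = age_factor(timedelta(days=days))
    # Contribution of the age term to the job priority.
    print(days, factor, PRIORITY_WEIGHT_AGE * factor)
```

Under these assumptions a job that has been eligible for 3.5 days gets half of the age weight, and waiting beyond 7 days adds nothing further.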
The QOS priorities can be viewed with sacctmgr show qos
To obtain the QOS factor, the priorities shown are normalised to the highest one.
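A minimal sketch of that normalisation, assuming the factor is simply each QOS priority divided by the largest one. The QOS names and priority values are invented for the example, and qos_factors is a hypothetical helper.

```python
def qos_factors(priorities):
    """Divide every QOS priority by the highest one, giving values in [0, 1]."""
    highest = max(priorities.values())
    if highest == 0:
        # No QOS carries any priority, so every factor is zero.
        return {name: 0.0 for name in priorities}
    return {name: p / highest for name, p in priorities.items()}


# Hypothetical QOS names and raw priorities.
example_qos = {"low": 10, "normal": 100, "high": 500}
print(qos_factors(example_qos))
```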
The partition priorities can be viewed with scontrol show partitions
Notice that every partition has an associated QOS that will carry its own priority factor.
The job size factor is equal to 1.0 for a job that requests all the nodes on the machine.
Goal: prioritize jobs from recently under-serviced accounts, and de-prioritize jobs from recently over-serviced accounts.
Two algorithms: Classic Fair Share and Fair Tree.
From the documentation:
The association is a combination of cluster, account, user names and optional partition name.
We can look at all of them with the command sshare, or sacctmgr show associations
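In the Fair Tree algorithm, start with
rank=user_assoc_count
then visit the associations in descending order of Level Fairshare, assigning each user the fair-share factor
(rank--/user_assoc_count)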
(From the docs)
$LF = S / U$
where $S = S_{raw,self} / S_{raw,self+siblings}$ and $U = U_{raw,self} / U_{raw,self+siblings}$.
$S_{raw}$ represents the shares assigned to the association, while $U_{raw}$ represents the resource usage...
(From the docs)
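The Level Fairshare and ranking steps can be sketched as follows. This is a simplified illustration, not Slurm's implementation: ties between siblings are ignored, the tree is a plain nested dict, and the account names, user names, shares and usage values are all invented.

```python
def level_fairshares(children):
    """LF = S / U for every child, computed relative to its siblings."""
    total_shares = sum(c["shares"] for c in children)
    total_usage = sum(c["usage"] for c in children)
    result = {}
    for c in children:
        s = c["shares"] / total_shares
        u = c["usage"] / total_usage if total_usage else 0.0
        # An association with no usage sorts ahead of all its siblings.
        result[c["name"]] = s / u if u else float("inf")
    return result


def count_users(node):
    """Number of user associations (leaves) below an account node."""
    if "children" not in node:
        return 1
    return sum(count_users(c) for c in node["children"])


def fair_tree(root):
    """Walk the tree and give each user the factor rank / user_assoc_count."""
    user_assoc_count = count_users(root)
    rank = user_assoc_count
    factors = {}

    def visit(children):
        nonlocal rank
        lf = level_fairshares(children)
        ordered = sorted(children, key=lambda c: lf[c["name"]], reverse=True)
        for child in ordered:
            if "children" in child:
                visit(child["children"])       # account: descend
            else:
                factors[child["name"]] = rank / user_assoc_count
                rank -= 1                      # the "rank--" step

    visit(root["children"])
    return factors


# Hypothetical association tree: two accounts with equal shares.
tree = {"name": "root", "children": [
    {"name": "acct_a", "shares": 1, "usage": 30, "children": [
        {"name": "alice", "shares": 1, "usage": 10},
        {"name": "bob", "shares": 1, "usage": 20}]},
    {"name": "acct_b", "shares": 1, "usage": 70, "children": [
        {"name": "carol", "shares": 1, "usage": 70}]},
]}
print(fair_tree(tree))
```

In this made-up tree acct_a has used less than its share, so both of its users rank above carol, even though bob has used more than alice.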
$U_H = U_t + D\,U_{t-1} + D^2\,U_{t-2} + \dots$
where the decay factor $D$ is set by the PriorityDecayHalfLife variable.
(From the docs)
(Or, at least, this is true in the Classic Fair Share algorithm; there is no clear mention of this in the Fair Tree algorithm page.)
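A short sketch of the decayed usage sum, assuming D is chosen so that a period's usage counts for half as much once one half-life has passed. The half-life, period length and usage values are invented for the example, and both function names are hypothetical.

```python
def decay_factor(half_life, period):
    """D such that usage weighs half as much after one half_life.

    half_life and period must be in the same unit (days here).
    """
    return 0.5 ** (period / half_life)


def historical_usage(usage_by_period, half_life, period):
    """U_H = U_t + D*U_(t-1) + D^2*U_(t-2) + ...

    usage_by_period[0] is the current period, [1] the previous one, etc.
    """
    d = decay_factor(half_life, period)
    return sum(u * d ** k for k, u in enumerate(usage_by_period))


# Hypothetical: 7-day half-life, usage recorded in 7-day periods.
print(historical_usage([100, 100, 100], half_life=7, period=7))
```

With a period equal to the half-life, D is 0.5, so the same raw usage contributes 100, then 50, then 25 as it ages.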
sprio -l: look at the terms in the priority formula
sshare -l: look at fair share-related quantities
Backfill scheduling will start lower priority jobs if doing so does not delay the expected start time of any higher priority jobs.
(From the docs and this presentation)
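The backfill rule can be illustrated with a deliberately simplified check. It assumes a single pool of identical nodes and compares the candidate only against the highest priority pending job, whereas the real scheduler has to respect every higher priority job. The function names and all numbers are hypothetical.

```python
def earliest_start(needed, free_now, running, now=0):
    """Expected start time of the top job and the nodes left spare then.

    running is a list of (end_time, nodes) for jobs currently executing.
    """
    free = free_now
    if free >= needed:
        return now, free - needed
    for end, nodes in sorted(running):
        free += nodes                 # nodes released when this job ends
        if free >= needed:
            return end, free - needed
    raise ValueError("job needs more nodes than the machine has")


def can_backfill(nodes, walltime, free_now, top_start, spare, now=0):
    """True if starting the job now cannot delay the top job."""
    if nodes > free_now:
        return False                  # does not fit right now
    finishes_in_time = now + walltime <= top_start
    uses_only_spare = nodes <= spare
    return finishes_in_time or uses_only_spare


# Hypothetical 10-node machine: 2 nodes idle, 8 busy.
running = [(4, 6), (10, 2)]           # (end_time in hours, nodes)
top_start, spare = earliest_start(needed=8, free_now=2, running=running)

print(can_backfill(2, 3, 2, top_start, spare))   # short job
print(can_backfill(2, 6, 2, top_start, spare))   # long job
```

Here the top job is expected to start at hour 4. A 2-node job asking for 3 hours fits in the gap, while the same job asking for 6 hours would still hold nodes the top job needs, so it is not backfilled. This is also why an accurate, short time limit helps small jobs start sooner.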