Slurm commands and options

srun/sbatch

The following options are intended to be used with sbatch or srun.

See also the FAQ page.

resource options

| option | purpose | examples |
| --- | --- | --- |
| `--ntasks=<count>` | Set the number of tasks for this job. | `--ntasks=256` |
| `--ntasks-per-node=<count>` | Set the number of tasks per node. | `--ntasks-per-node=128` |
| `--cpus-per-task=<count>` | Set the number of CPUs per task (default: 1). | `--cpus-per-task=2` |
| `--constraint=<feature>` | Only use nodes that have this feature. | `--constraint=HighMem` |
| `--exclude=<nodenames>` | Do not use the listed nodes. A path to a file containing a node list may also be specified. | `--exclude=r11n01`, `--exclude=./exclude-list.txt` |
| `--nodelist=<nodenames>` | Request the listed nodes; Slurm may add further nodes if the job needs more. A path to a file containing a node list may also be specified. | `--nodelist=r11n01`, `--nodelist=./nodelist.txt` |
| `--gres=<name>[[:type]:count]` | Generic resource specifier (per node). | Any GPU: `--gres=gpu:1`; two H100 GPUs: `--gres=gpu:h100:2` |
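The resource options above can also be placed in a batch script as `#SBATCH` directives. The following is a minimal sketch, assuming a hypothetical program `./my_app` and a hypothetical script name `resources_job.sh`; the node name `r11n01` is taken from the examples above. It only writes the script and prints the submit command, so nothing is sent to the scheduler.

```shell
# Sketch only: resources_job.sh and ./my_app are placeholder names.
cat > resources_job.sh <<'EOF'
#!/bin/bash
# 256 tasks in total, 128 per node, which amounts to two nodes.
#SBATCH --ntasks=256
#SBATCH --ntasks-per-node=128
# One CPU per task (this is also the default).
#SBATCH --cpus-per-task=1
# Do not schedule the job on node r11n01.
#SBATCH --exclude=r11n01

srun ./my_app
EOF

echo "Submit with: sbatch resources_job.sh"
```

Keeping the options in the script rather than on the command line makes the resource request reproducible; options given on the `sbatch` command line still take precedence over `#SBATCH` directives.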

timing options

| option | purpose | example |
| --- | --- | --- |
| `--deadline=<TIME>` | Remove the job if it cannot finish before this deadline. Time formats: `YYYY-MM-DD[THH:MM[:SS]]` or `HH:MM[:SS]`, optionally followed by `AM` or `PM`. | Finish before 13:00 on 10 January 2024: `--deadline=2024-01-10T13:00` |
| `--begin=<TIME>` | Defer job allocation until the specified time. Time formats: the same as for `--deadline`, plus relative times of the form `now+<count><unit>` (seconds by default; also minutes, hours, days, weeks). | Begin at 16:00: `--begin=16:00`; begin one hour after submission: `--begin=now+1hour` |
| `--time=<TIME>` | Set the job's wall-time limit. | Limit the job to 9 hours: `--time=09:00:00` |
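As an illustration of how the timing options combine, the sketch below writes a batch script that starts no earlier than 16:00 and may run for at most 9 hours. The script name `timing_job.sh` is a placeholder, and `--deadline` is left out on purpose because a fixed date in an example goes stale quickly. Nothing is submitted; the snippet only prints the submit command.

```shell
# Sketch only: timing_job.sh is a placeholder name.
cat > timing_job.sh <<'EOF'
#!/bin/bash
#SBATCH --ntasks=1
# Do not start before 16:00.
#SBATCH --begin=16:00
# Stop the job after at most 9 hours of wall time.
#SBATCH --time=09:00:00

srun hostname
EOF

echo "Submit with: sbatch timing_job.sh"
```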

other useful options (srun/sbatch)

| option | purpose | examples |
| --- | --- | --- |
| `--reservation=<names>` | Allocate resources from the named reservations. | `--reservation=lscale_test` |
| `--partition=<partition_name>` | Choose the partition in which to run the job. May be necessary for some resources, such as H100 GPUs. | `--partition=dev` |

signaling options (scancel)

These options are for canceling or signaling jobs.

See also the advanced section.

| option | purpose | examples |
| --- | --- | --- |
| `--full` | Also send the signal to the batch script and its child processes. By default, signals other than KILL are not delivered to the batch script. | `--full` |
| `--jobname=<job_name>` | Restrict the scancel operation to jobs with this job name. | `--jobname=RUN_Z` |
| `--signal=<signal_name_or_number>` | Name or number of the signal to send (default: KILL). | `--signal=USR1`, `--signal=10` |
| `--state=<job_state_name>` | Act only on jobs that are in the given state: `PENDING`, `RUNNING` or `SUSPENDED`. | `--state="PENDING"` |
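To show how the signaling options fit together, here is a hedged sketch of a batch script that reacts to `USR1`, together with the `scancel` calls that would address it. The script name `signal_job.sh` is a placeholder, the job name `RUN_Z` is borrowed from the example above, and `--job-name` is the `sbatch` option used to set that name. The snippet only writes the script and prints the commands.

```shell
# Sketch only: signal_job.sh is a placeholder name.
cat > signal_job.sh <<'EOF'
#!/bin/bash
#SBATCH --job-name=RUN_Z
#SBATCH --time=00:10:00

# On SIGUSR1, report it, stop the background sleep and exit cleanly.
trap 'echo "caught USR1"; kill "$sleep_pid" 2>/dev/null; exit 0' USR1

# Run the work in the background and wait for it, so the shell can
# handle the signal right away instead of blocking on a foreground command.
sleep 600 &
sleep_pid=$!
wait "$sleep_pid"
EOF

echo "Submit with: sbatch signal_job.sh"
echo "Signal with: scancel --full --signal=USR1 --jobname=RUN_Z"
echo "Cancel while pending: scancel --state=PENDING --jobname=RUN_Z"
```

The `--full` flag matters here because the `trap` lives in the batch script itself, which does not receive signals other than KILL unless `--full` is given.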