Quick navigation – Interactive jobs, Parallel batch jobs, Parallel hybrid jobs, Serial batch jobs, Array jobs
The Nestum cluster uses the SLURM batch system for controlling user jobs. A good introduction to using SLURM can be found here. Convenient SLURM commands are listed below. A Postfix client, installed on the cluster, will notify you of the status of your jobs via email. Note that by default the number of cores used for your job is 1, and the default memory allocated for your job is 2 GB. If your job uses more than that, it will fail with the error Exceeded job memory limit. To set a larger (or smaller) limit, add to your job submission:
#SBATCH --mem-per-cpu X
where X is the amount of memory in MB (MB is the default unit if none is specified). This is especially important for array jobs. In addition, there are several environment variables associated with each SLURM job; a short usage sketch follows the list.
- SLURM_JOB_ID – each SLURM job (or task) has a unique id reserved at submission. The value is stored in the SLURM_JOB_ID environment variable.
- SLURM_TMPDIR – each SLURM job has an associated temporary folder with a maximum size of up to 110 GB per compute node. Please keep in mind that when your job does not occupy an entire compute node, the temporary storage may be shared with another SLURM job.
- SLURM_NPROCS – the number of compute cores reserved for the job
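For illustration, here is a minimal sketch of a job script using these variables (myprog.x, input.dat and the results folder are hypothetical placeholders):
#!/bin/bash
#
#SBATCH -n 1 # number of tasks
#SBATCH -t 0-1:00 # time (D-HH:MM)
echo "Job $SLURM_JOB_ID is running with $SLURM_NPROCS reserved cores"
# stage the input data into the job-local temporary storage
cp $HOME/input.dat $SLURM_TMPDIR/
cd $SLURM_TMPDIR
$HOME/myprog.x input.dat
# copy the results back before the job ends, tagged with the job id
cp output.dat $HOME/results/output.$SLURM_JOB_ID.dat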
Interactive jobs
You can have interactive jobs with or without X11 support. In order to have an interactive job without X11 support (an interactive batch job), simply run the following command:
srun --pty /bin/bash
In case you need to reserve more than one compute core, you can use the additional option -n. For example, in order to run an interactive job with 16 reserved compute cores, execute:
srun -n16 --pty /bin/bash
If you need interactive jobs with X11 support for GUI-enabled applications, add the parameter --x11=first to the srun command:
srun --pty --x11=first xterm
or first start an interactive shell session with X11 forwarding enabled and then run the application, such as xterm, gnuplot, etc.:
srun --pty --x11=first /bin/bash
xterm
When X11 forwarding is enabled using the --x11 option, the user is restricted to only one compute core.
Submit Parallel batch job
Batch jobs in SLURM are submitted with the sbatch command. They are simple shell script files with additional parameters passed to sbatch, escaped with #SBATCH. For example, let's consider the following bash shell script my.job:
#!/bin/bash
#
#SBATCH -p medium.p # partition (queue)
#SBATCH -N 2 # number of nodes
#SBATCH -n 64 # number of tasks
#SBATCH -t 0-2:00 # time (D-HH:MM)
#SBATCH -o slurm.%N.%j.out # STDOUT
#SBATCH -e slurm.%N.%j.err # STDERR
#SBATCH --mail-type=<type> # notification trigger
#SBATCH --mail-user=<user> # email address
module load openmpi
mpirun helloworld.x
The command line parameters in the above script are self-explanatory and well documented in the sbatch man page, but let's say a few words about each option.
#SBATCH -p medium.p # partition (queue)
sets the partition (queue) to which the job will be submitted. If this option is omitted, the default queue is used. The next option
#SBATCH -N 2 # number of nodes
sets the number of nodes which will be allocated for the job, in our case 2. In addition, we need to set the total number of tasks:
#SBATCH -n 64 # number of tasks
sets the number of tasks to be executed. Since the default number of cpus-per-task is 1 and each of the two requested compute nodes has 32 cores, a total of 64 cores will be allocated. The execution time is specified with the -t option:
#SBATCH -t 0-2:00 # time (D-HH:MM)
If this value is omitted, the default is 10 minutes. Finally, the standard output and standard error streams can be redirected into the files slurm.%N.%j.out and slurm.%N.%j.err respectively, where %N represents the node name and %j the job id.
#SBATCH --mail-type=<type>
#SBATCH --mail-user=<user>
these lines set the rules for email notification. Setting them is not compulsory. The <type> field can be ALL, BEGIN, END or FAIL, which are self-explanatory. The <user> field should be the desired email address. If left blank, notifications will be sent to the email address associated with the user account.
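For example, to receive an email when the job ends or fails (the address below is a placeholder):
#SBATCH --mail-type=END,FAIL
#SBATCH --mail-user=user@example.com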
Submitting the job is quite simple:
sbatch -t 0-3:00 my.job
In the above example the command line option -t will override the option in the job file.
Submit hybrid parallel job
Running hybrid parallel MPI+OpenMP jobs requires the additional sbatch option cpus-per-task, and the environment variable OMP_NUM_THREADS needs to be set equal to SLURM_CPUS_PER_TASK. For example, let's say we want to run 4 MPI processes with 16 threads each in parallel; here is an example my.job:
#!/bin/bash
#
#SBATCH -n 4 # number of tasks
#SBATCH --cpus-per-task 16 # number of cpu per task
#SBATCH -t 0-2:00 # time (D-HH:MM)
#SBATCH -o slurm.%j.out # STDOUT
#SBATCH -e slurm.%j.err # STDERR
#SBATCH --mail-type=<type> # notification trigger
#SBATCH --mail-user=<user> # email address
module load openmpi
export OMP_NUM_THREADS=$SLURM_CPUS_PER_TASK
mpirun ./myhybridparallel.x
Submit serial batch job
In case you need to submit a serial batch job which will occupy only one core, it is important to specify the -n 1 option:
#!/bin/bash
#
#SBATCH -n 1 # number of tasks
#SBATCH -t 0-2:00 # time (D-HH:MM)
#SBATCH -o slurm.%j.out # STDOUT
#SBATCH -e slurm.%j.err # STDERR
#SBATCH --mail-type=<type> # notification trigger
#SBATCH --mail-user=<user> # email address
# load any necessary software module if it's required
./myprog.x
Array jobs
Job arrays offer a mechanism for submitting and managing collections of similar jobs quickly and easily. All jobs must have the same initial options (e.g. size, time limit, etc.).
Job arrays are only supported for batch jobs and the array index values are specified using the --array
or -a
option of the sbatch
command. The option argument can be specific array index values, a range of index values, and an optional step size as shown in the examples below. Note that the minimum index value is zero and the maximum value is a Slurm configuration parameter (MaxArraySize
minus one). Jobs which are part of a job array will have the environment variable SLURM_ARRAY_TASK_ID
set to its array index value.
# Submit a job array with index values between 0 and 31
$ sbatch --array=0-31 -N1 tmp
# Submit a job array with index values of 1, 3, 5 and 7
$ sbatch --array=1,3,5,7 -N1 tmp
# Submit a job array with index values between 1 and 7
# with a step size of 2 (i.e. 1, 3, 5 and 7)
$ sbatch --array=1-7:2 -N1 tmp
A maximum number of simultaneously running tasks from the job array may be specified using a “%” separator. For example --array=0-15%4
will limit the number of simultaneously running tasks from this job array to 4.
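Putting this together, a minimal sketch of an array job script is shown below (myprog.x and the input file naming scheme are hypothetical); note the --mem-per-cpu option mentioned earlier, which is especially important for array jobs:
#!/bin/bash
#
#SBATCH -n 1 # number of tasks per array element
#SBATCH -t 0-1:00 # time (D-HH:MM)
#SBATCH --array=0-31 # array index values
#SBATCH --mem-per-cpu=2000 # memory per core in MB
#SBATCH -o slurm.%A_%a.out # STDOUT (%A is the job id, %a the array index)
# each array task processes its own input file
./myprog.x input.$SLURM_ARRAY_TASK_ID.dat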
List jobs
In order to list all running jobs, you can use the squeue command without any additional arguments:
squeue
If you would like to get the running jobs of a particular user, use the command switch -u:
squeue -u user
Cancel job
In case you need to cancel a running job, first obtain the job id using the squeue command, then use the scancel command:
scancel job_id
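scancel also accepts filters; for example, to cancel all of your own running jobs at once:
scancel -u $USER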
Queue status
The state of the compute nodes in the SLURM environment can be reviewed using the sinfo command:
sinfo
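To restrict the output to a single partition, use the -p switch, for example for the medium.p partition used above:
sinfo -p medium.p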