Job submission – Nestum Cluster

Quick navigation – Interactive jobs, Parallel batch jobs, Parallel hybrid jobs, Serial batch jobs, Array jobs

The Nestum cluster uses the SLURM batch system to control user jobs. A good introduction to Slurm can be found here. Convenient Slurm commands are listed below. A Postfix client installed on the cluster will notify you of the status of your jobs via email. Note that by default a job uses 1 core and is allocated 2 GB of memory. If your job uses more memory than that, it fails with the error "Exceeded job memory limit". To set a larger (or smaller) limit, add the following to your job submission:

#SBATCH --mem-per-cpu X

where X is the amount of memory per core in MB. If the option is not specified, the default of 2 GB applies. This is especially important for array jobs. In addition, there are several environment variables associated with each Slurm job:

SLURM_JOB_ID – each Slurm job (or task) has a unique id reserved at submission. The value is stored in the SLURM_JOB_ID environment variable.
SLURM_TMPDIR – each Slurm job has an associated temporary folder with a maximum size of up to 110 GB per compute node. Please keep in mind that when your job does not occupy an entire compute node, the temporary storage may be shared with another Slurm job.
SLURM_NPROCS – the number of compute cores reserved for the job.
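
The following sketch shows how a job script might combine the memory directive with these variables. The 4000 MB value and the scratch directory name are illustrative assumptions, and the fallback values exist only so that the script can also be tried outside of Slurm.

```shell
#!/bin/bash
#SBATCH -n 1                      # number of tasks
#SBATCH --mem-per-cpu 4000        # illustrative value: 4000 MB per core

# Fall back to placeholder values when the script runs outside of Slurm.
job_id="${SLURM_JOB_ID:-nojob}"
ncores="${SLURM_NPROCS:-1}"
tmp_root="${SLURM_TMPDIR:-/tmp}"

# Use a per-job subdirectory (hypothetical name), since the temporary
# storage of a node may be shared with other jobs.
scratch="${tmp_root}/scratch.${job_id}"
mkdir -p "$scratch"

echo "job ${job_id}: ${ncores} core(s), scratch in ${scratch}"
```

Naming the directory after the job id keeps the files of different jobs apart when several jobs share one node.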

Submit interactive job

You can run interactive jobs with or without X11 support. To start an interactive job without X11 support (interactive batch job), simply run the following command:

srun --pty /bin/bash

If you need to reserve more than one compute core, use the additional option -n. For example, to run an interactive job with 16 reserved compute cores, execute:

srun -n16 --pty /bin/bash

If you need an interactive job with X11 for GUI-enabled applications, add the parameter --x11=first to the srun command:

srun --pty --x11=first xterm

or first start an interactive shell session with X11 forwarding enabled and then run an application such as xterm, gnuplot, etc.:

srun --pty --x11=first /bin/bash
xterm

When X11 forwarding is enabled with the --x11 option, the user is restricted to a single compute core.

Submit Parallel batch job

Batch jobs in Slurm are submitted with the sbatch command. A batch job is a simple shell script in which additional parameters for sbatch are given on lines prefixed with #SBATCH. For example, consider the following bash shell script my.job:

#!/bin/bash
#
#SBATCH -p medium.p               # partition (queue)
#SBATCH -N 2                      # number of nodes
#SBATCH -n 64                     # number of tasks
#SBATCH -t 0-2:00                 # time (D-HH:MM)
#SBATCH -o slurm.%N.%j.out        # STDOUT
#SBATCH -e slurm.%N.%j.err        # STDERR
#SBATCH --mail-type=<type>        # notification trigger
#SBATCH --mail-user=<user>        # email address

module load openmpi

mpirun helloworld.x

The command line parameters in the above script are largely self-explanatory and well documented in the sbatch man page, but a few words on each option follow.

#SBATCH -p medium.p               # partition (queue)

sets the partition (queue) to which the job will be submitted. If this option is omitted, the default queue is used. The next option

#SBATCH -N 2                      # number of nodes

sets the number of nodes that will be allocated for the job, in our case 2. In addition, we need to set the total number of tasks:

#SBATCH -n 64                     # number of tasks

sets the number of tasks to be executed. Since the default number of cpus-per-task is 1 and each of the 2 requested compute nodes has 32 cores, a total of 64 cores will be allocated. The execution time is specified with the -t option

#SBATCH -t 0-2:00                 # time (D-HH:MM)

If this value is omitted, the default of 10 minutes is used. Finally, the standard output and standard error streams can be redirected into the files slurm.%N.%j.out and slurm.%N.%j.err respectively, where %N represents the node name and %j is the job id.
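
To make the file name patterns concrete, the sketch below mimics the substitution by hand. It is an illustration only, since Slurm performs this expansion itself. The function name expand_pattern and the values node01 and 12345 are made-up assumptions.

```shell
#!/bin/bash
# Illustration only: Slurm performs this substitution itself.
# expand_pattern PATTERN NODE JOBID prints PATTERN with %N and %j replaced.
expand_pattern() {
    printf '%s\n' "$1" | sed -e "s/%N/$2/g" -e "s/%j/$3/g"
}

# node01 and 12345 are made-up example values.
expand_pattern "slurm.%N.%j.out" node01 12345
expand_pattern "slurm.%N.%j.err" node01 12345
```

Because the job id is part of the name, output files from different jobs do not overwrite each other.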

#SBATCH --mail-type=<type>
#SBATCH --mail-user=<user>

These lines set the rules for email notification. Setting them is not compulsory. The <type> can be ALL, BEGIN, END, or FAIL, which are self-explanatory. The <user> field should be the desired email address. If left blank, notifications will be sent to the email address associated with the user account.

Submitting the job is quite simple:

sbatch -t 0-3:00 my.job

In the above example, the command line option -t overrides the value set in the job file.
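
On success, sbatch prints a line of the form "Submitted batch job <id>". In scripts it can be handy to extract the id from that line, for example to pass it to scancel later. A minimal sketch, in which the function name and the sample id 12345 are assumptions:

```shell
#!/bin/bash
# job_id_from_output LINE prints the last word of LINE,
# which is the job id in "Submitted batch job <id>".
job_id_from_output() {
    local line="$1"
    echo "${line##* }"
}

# Sample line standing in for real sbatch output; on the cluster one would use
#   line=$(sbatch -t 0-3:00 my.job)
line="Submitted batch job 12345"
job_id=$(job_id_from_output "$line")
echo "job id: ${job_id}"
```

Alternatively, sbatch accepts the --parsable option, which prints the job id without the surrounding text.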

Parallel hybrid jobs

Running hybrid parallel MPI+OpenMP jobs requires the additional sbatch option --cpus-per-task, and the environment variable OMP_NUM_THREADS needs to be set equal to SLURM_CPUS_PER_TASK. For example, to run 4 MPI processes with 16 threads each in parallel, my.job could look like this:

#!/bin/bash
#
#SBATCH -n 4                      # number of tasks
#SBATCH --cpus-per-task 16        # number of cpus per task
#SBATCH -t 0-2:00                 # time (D-HH:MM)
#SBATCH -o slurm.%j.out           # STDOUT
#SBATCH -e slurm.%j.err           # STDERR
#SBATCH --mail-type=<type>        # notification trigger
#SBATCH --mail-user=<user>        # email address

module load openmpi

export OMP_NUM_THREADS=$SLURM_CPUS_PER_TASK

mpirun ./myhybridparallel.x
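
Slurm sets SLURM_CPUS_PER_TASK only when --cpus-per-task is given, so a script that is also used without that option may want a fallback. The sketch below shows one way to do this and checks the core arithmetic for the example above. The helper total_cores is a hypothetical name.

```shell
#!/bin/bash
# SLURM_CPUS_PER_TASK is only set when --cpus-per-task is given,
# so default to 1 thread to avoid an empty OMP_NUM_THREADS.
export OMP_NUM_THREADS="${SLURM_CPUS_PER_TASK:-1}"

# total_cores NTASKS CPUS_PER_TASK prints the number of cores the job reserves.
total_cores() {
    echo $(( $1 * $2 ))
}

# 4 MPI processes with 16 threads each, as in the example above.
echo "cores reserved: $(total_cores 4 16)"
```

With 32 cores per node, such a job fills two compute nodes.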

Serial batch jobs

If you need to submit a serial batch job that occupies only one core, it is important to specify only the -n 1 option:

#!/bin/bash
#
#SBATCH -n 1                      # number of tasks
#SBATCH -t 0-2:00                 # time (D-HH:MM)
#SBATCH -o slurm.%j.out           # STDOUT
#SBATCH -e slurm.%j.err           # STDERR
#SBATCH --mail-type=<type>        # notification trigger
#SBATCH --mail-user=<user>        # email address

# load any necessary software module if it's required

./myprog.x

Array jobs

Job arrays offer a mechanism for submitting and managing collections of similar jobs quickly and easily. All jobs must have the same initial options (e.g. size, time limit, etc.).

Job arrays are only supported for batch jobs and the array index values are specified using the --array or -a option of the sbatch command. The option argument can be specific array index values, a range of index values, and an optional step size as shown in the examples below. Note that the minimum index value is zero and the maximum value is a Slurm configuration parameter (MaxArraySize minus one). Jobs which are part of a job array will have the environment variable SLURM_ARRAY_TASK_ID set to its array index value.

# Submit a job array with index values between 0 and 31
$ sbatch --array=0-31    -N1 tmp

# Submit a job array with index values of 1, 3, 5 and 7
$ sbatch --array=1,3,5,7 -N1 tmp

# Submit a job array with index values between 1 and 7
# with a step size of 2 (i.e. 1, 3, 5 and 7)
$ sbatch --array=1-7:2   -N1 tmp

A maximum number of simultaneously running tasks from the job array may be specified using a “%” separator. For example --array=0-15%4 will limit the number of simultaneously running tasks from this job array to 4.
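
A common use of SLURM_ARRAY_TASK_ID is to select a different input for each array task. The job script below is a minimal sketch. The input_<index>.dat naming scheme and the helper input_for_task are hypothetical, and the fallback index 0 exists only so that the script can be tried outside of Slurm.

```shell
#!/bin/bash
#SBATCH -n 1                      # each array task is a serial job
#SBATCH -t 0-0:10                 # time (D-HH:MM)
#SBATCH -o slurm.%A_%a.out        # %A = array job id, %a = array index

# input_for_task INDEX prints the input file name for that array index
# (hypothetical naming scheme: one file per index).
input_for_task() {
    echo "input_$1.dat"
}

# Default to index 0 so the script can be tried outside of Slurm.
task_id="${SLURM_ARRAY_TASK_ID:-0}"
input_file=$(input_for_task "$task_id")

echo "task ${task_id} would process ${input_file}"
```

Submitted with --array=0-31, this script would run 32 times, each run seeing a different index.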

List jobs

To list all jobs in the queue (running and pending), use the squeue command without any additional arguments:

squeue

To list only the jobs of a particular user, use the -u switch:

squeue -u user

Cancel job

To cancel a running job, first obtain its job id using the squeue command, then use the scancel command:

scancel job_id

Queue status

The state of the compute nodes in the Slurm environment can be reviewed using the command

sinfo