Interactive Jobs
These can be run in two ways, via salloc and srun. If you just want a single interactive session on a compute node then using srun to allocate resources for a single task and launch a shell as that one task is probably the way to go. But if you want to run things in parallel or more than one task at once in your interactive job, use salloc to allocate resources and then srun or mpirun to start the tasks, as starting multiple copies of an interactive shell at once probably isn’t what you want.
# One interactive task. Quit the shell to finish
srun --pty -u bash -i
# Allocate three tasks, followed by running three instances of 'myprog' within the allocation.
# Then start one copy of longprog and two copies of myprog2, then release the allocation
salloc -n3
srun myprog
srun -n1 longprog &
srun -n2 myprog2
exit
Batch Jobs
Scripts for batch jobs must start with the interpreter to be used to execute them (different from PBS/Torque). You can give arguments to sbatch as comments in the script. Example:
#!/bin/bash
# Name of the job
#SBATCH -J testjob
# Partition to use - generally not needed so commented out
##SBATCH -p NONIB
# time limit
#SBATCH --time=10:0:0
# Number of processes
#SBATCH -n1
# Start the program
srun myprogram
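Once the script is saved you submit it with sbatch; the filename below is just an illustration, and sbatch prints the job ID on success.
# Submit the batch script to the queue
sbatch testjob.sh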
Asking for resources:
salloc/srun/sbatch support a huge array of options which let you ask for nodes, CPUs, tasks, sockets, threads, memory, etc. If you combine them SLURM will try to work out a sensible allocation, so for example if you ask for 13 tasks and 5 nodes SLURM will cope. Here are the ones that are most likely to be useful:
Option | Meaning
---|---
-n | Number of tasks (roughly, processes)
-N | Number of nodes to assign. If you’re using this, you might also be interested in --tasks-per-node
--tasks-per-node | Maximum tasks to assign per node if using -N
--cpus-per-task | Assign tasks containing more than one CPU. Useful for jobs with shared memory parallelization
-C | Features the nodes assigned must have
-w | Names of nodes that must be included – for selecting a particular node or nodes
--mem-per-cpu | Use this to make SLURM assign you more memory than the default amount available per CPU. The units are MB. Works by automatically assigning sufficient extra CPUs to the job to ensure it gets access to enough memory.
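As a sketch of how these options combine (the memory figure and the script name below are placeholders, not site defaults), the 13-tasks-over-5-nodes example mentioned above could be submitted like this:
# 13 tasks spread over 5 nodes, with 4000 MB of memory per CPU
sbatch -n13 -N5 --mem-per-cpu=4000 myscript.sh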
MPI jobs
Inside a batch script you should just be able to call mpirun, which will communicate with SLURM and launch the job over the appropriate set of nodes for you:
#!/bin/bash
# 13 tasks over 5 nodes
#SBATCH -n13 -N5
echo Hosts are
srun -l hostname
mpirun /home/cen1001/src/mpi_hello
To run MPI jobs interactively you can assign some nodes using salloc, and then call mpirun from inside that allocation. Unlike PBS/Torque, the shell you launch with salloc runs on the same machine you ran salloc on, not on the first node of the allocation. But mpirun will do the right thing.
salloc -n12 bash
mpirun /home/cen1001/src/mpi_hello
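To see the behaviour described above for yourself, you can compare the hostname of the shell with the hostnames of the allocated tasks; this quick check uses only commands already shown on this page.
# Inside the shell started by salloc:
hostname           # the machine you ran salloc on
srun -l hostname   # the nodes actually allocated to the job
exit               # leave the shell to release the allocation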
You can even use srun to launch MPI jobs interactively without mpirun’s intervention. The --mpi option here tells srun which method the MPI library uses for launching tasks; pmi2 is the correct one for use with our OpenMPI installations.
srun --mpi=pmi2 -n13 ./mpi_hello
OpenMP jobs
The Nestum cluster is a homogeneous Intel Xeon based parallel machine in which each node has 32 compute cores with shared memory. Users can run shared memory parallel (OpenMP) jobs by specifying one task (-n 1) with a maximum of 32 cores per task (-c NUMBER_OF_CORES_PER_TASK).
#!/bin/bash
...
#SBATCH -n 1
#SBATCH -c NUMBER_OF_CORES_PER_TASK
...
export OMP_NUM_THREADS=$SLURM_CPUS_PER_TASK
srun myApp
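As a concrete sketch of the template above (the 8-thread count and the program name ./omp_app are only examples; anything up to 32 cores per task will work on Nestum):
#!/bin/bash
# One task with 8 cores for OpenMP threads
#SBATCH -n 1
#SBATCH -c 8
# Match the number of OpenMP threads to the cores SLURM assigned
export OMP_NUM_THREADS=$SLURM_CPUS_PER_TASK
srun ./omp_app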
Hybrid parallel jobs
If there is a need to run hybrid MPI and OpenMP parallel jobs, you should again specify the number of tasks (-n NUMBER_OF_MPI_PROCESS) and the number of OpenMP threads per MPI process, i.e. the number of cores per task (-c NUMBER_OF_CORES_PER_TASK):
#!/bin/bash
...
#SBATCH -n NUMBER_OF_MPI_PROCESS
#SBATCH -c NUMBER_OF_CORES_PER_TASK
...
export OMP_NUM_THREADS=$SLURM_CPUS_PER_TASK
srun myApp
For example, if you want to run a hybrid parallel job on three compute nodes, utilizing all cores on each node with OpenMP threads, then NUMBER_OF_MPI_PROCESS = 3 and NUMBER_OF_CORES_PER_TASK = 32:
#!/bin/bash
...
#SBATCH -n 3
#SBATCH -c 32
...
export OMP_NUM_THREADS=$SLURM_CPUS_PER_TASK
srun myApp
The total number of threads will then be 96, i.e. 3 (three) MPI processes, each with 32 OpenMP threads.
Non-MPI Parallel jobs
In a parallel job which doesn’t use MPI you can find out which hosts you have, and how many, by running srun -l hostname inside the job script. The -l option will print the SLURM task number next to the hostname assigned to each task; skip it if you want just the list of hostnames. You can then use srun inside the job to start individual tasks.
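A minimal sketch of such a job script follows; the two worker programs are placeholders for whatever you actually want to run.
#!/bin/bash
# Four tasks for a non-MPI parallel job (4 is just an example)
#SBATCH -n 4
# Print the SLURM task number and assigned hostname for each task
srun -l hostname
# Launch individual single-task steps in parallel and wait for them to finish
srun -n1 ./worker_a &
srun -n1 ./worker_b &
wait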