The Nestum cluster uses the SLURM batch system to control user jobs. One good introduction to using Slurm can be found here. Convenient Slurm commands are listed below. A Postfix client installed on the cluster will notify you of the status of your jobs via email.
Note that by default your job uses 1 core and is allocated 2 GB of memory. If your job uses more memory than that, it will fail with the error "Exceeded job memory limit". To set a larger (or smaller) limit, add to your job submission:
#SBATCH --mem-per-cpu X
where X is the amount of memory per core in MB; if it is not specified, the default amount applies. This is especially important for array jobs. In addition, there are several environment variables associated with each Slurm job.
- SLURM_JOB_ID – each Slurm job (or task) has a unique id reserved at submission. The value is stored in the SLURM_JOB_ID environment variable.
- SLURM_TMPDIR – each Slurm job has an associated temporary folder with a maximum size of up to 110 GB per compute node. Please keep in mind that when your job does not occupy an entire compute node, the temporary storage may be shared with another Slurm job.
- SLURM_NPROCS – the number of compute cores reserved for the job.
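As a minimal sketch of how a job script might read these variables, the snippet below falls back to placeholder values when a variable is unset (for example when the script is tried outside Slurm). The fallback values and the names job_id, scratch and ncores are assumptions made for illustration, not cluster defaults.

```shell
#!/bin/bash
# Read the Slurm-provided variables, with illustrative fallbacks
# (the fallbacks are assumptions so the snippet also runs outside a job).
job_id="${SLURM_JOB_ID:-none}"
scratch="${SLURM_TMPDIR:-/tmp}"
ncores="${SLURM_NPROCS:-1}"

echo "job id:         $job_id"
echo "scratch dir:    $scratch"
echo "reserved cores: $ncores"
```

Inside a real job the three values come from Slurm, so the fallbacks are never used there.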
You can have interactive jobs with or without X11 support. To start an interactive job without X11 support (an interactive batch job), simply run the following command:
srun --pty /bin/bash
If you need to reserve more than one compute core, use the additional option -n. For example, to run an interactive job with 16 reserved compute cores, execute the command:
srun -n16 --pty /bin/bash
If you need an interactive job with X11 for GUI-enabled applications, add the parameter --x11=first to the srun command:
srun --pty --x11=first xterm
Alternatively, first start an interactive shell session with X11 forwarding enabled and then run the application (xterm, gnuplot, etc.) from it:
srun --pty --x11=first /bin/bash
xterm
Batch jobs in Slurm are submitted with the sbatch command. They are simple shell script files in which additional parameters for sbatch are given on lines prefixed with #SBATCH. For example, let's consider the following bash shell script my.job.
#!/bin/bash
#
#SBATCH -p medium.p                # partition (queue)
#SBATCH -N 2                       # number of nodes
#SBATCH -n 64                      # number of tasks
#SBATCH -t 0-2:00                  # time (D-HH:MM)
#SBATCH -o slurm.%N.%j.out         # STDOUT
#SBATCH -e slurm.%N.%j.err         # STDERR
#SBATCH --mail-type=<type>         # notification trigger
#SBATCH --mail-user=<user>         # email address
module load openmpi
mpirun helloworld.x
Although the command line parameters in the above script are self-explanatory and well documented in the sbatch man page, let's say a few words about each option.
#SBATCH -p medium.p # partition (queue)
sets the partition (queue) to which the job will be submitted. If this option is omitted, the default queue is used. The next option
#SBATCH -N 2 # number of nodes
sets the number of nodes that will be allocated for the job, in our case 2. In addition, we need to set the total number of tasks:
#SBATCH -n 64 # number of tasks
sets the number of tasks to be executed. Since the default number of cpus-per-task is 1 and each of the 2 requested compute nodes has 32 cores, a total of 64 cores will be allocated. The execution time is specified with the -t option:
#SBATCH -t 0-2:00 # time (D-HH:MM)
If this value is omitted, the default is 10 minutes. Finally, the standard output and standard error streams can be redirected into the files slurm.%N.%j.out and slurm.%N.%j.err respectively, where %N is the node name (short hostname) and %j is the job id.
#SBATCH --mail-type=<type>
#SBATCH --mail-user=<user>
These lines set the rules for email notification. Setting them is not compulsory. The <type> can be ALL, BEGIN, END or FAIL, which are self-explanatory. The <user> field should be the desired email address. If it is left blank, notifications will be sent to the email address associated with the user account.
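The relation between -N, -n and the number of cores per node can be checked with a little shell arithmetic. This is only a sketch: the value of 32 cores per node is taken from the description above, and the variable names are illustrative.

```shell
nodes=2             # value passed to -N
cores_per_node=32   # per the description above
cpus_per_task=1     # default when --cpus-per-task is not given

# Number of tasks that exactly fills the requested nodes
ntasks=$(( nodes * cores_per_node / cpus_per_task ))

echo "#SBATCH -N $nodes"
echo "#SBATCH -n $ntasks"
```

With these values the snippet prints the same -N 2 and -n 64 pair used in my.job.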
Submitting a job is quite simple:
sbatch -t 0-3:00 my.job
In the above example, the command line option -t overrides the option in the job file.
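When jobs are submitted from a script, it is often handy to keep the job id. sbatch normally confirms a submission with a line of the form "Submitted batch job <id>". The helper below, whose name parse_job_id is hypothetical, extracts the last field of such a line. It is a sketch that assumes this output format.

```shell
# Extract the job id from sbatch's confirmation line,
# assumed to look like: "Submitted batch job 12345"
parse_job_id() {
    echo "$1" | awk '{print $NF}'
}

# Example with a sample line; on the cluster the argument
# could be the output of sbatch, e.g. "$(sbatch my.job)"
parse_job_id "Submitted batch job 12345"
```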
Running hybrid parallel MPI+OpenMP jobs requires the additional sbatch option cpus-per-task, and the environment variable OMP_NUM_THREADS needs to be set equal to SLURM_CPUS_PER_TASK. For example, let's say we want to run 4 MPI processes with 16 threads each in parallel. Here is an example my.job:
#!/bin/bash
#
#SBATCH -n 4                       # number of tasks
#SBATCH --cpus-per-task 16         # number of cpu per task
#SBATCH -t 0-2:00                  # time (D-HH:MM)
#SBATCH -o slurm.%j.out            # STDOUT
#SBATCH -e slurm.%j.err            # STDERR
#SBATCH --mail-type=<type>         # notification trigger
#SBATCH --mail-user=<user>         # email address
module load openmpi
export OMP_NUM_THREADS=$SLURM_CPUS_PER_TASK
mpirun ./myhybridparallel.x
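SLURM_CPUS_PER_TASK is only set when --cpus-per-task is given, so a job script may want a fallback for OMP_NUM_THREADS. A minimal sketch, in which the function name omp_threads and the fallback of 1 thread are assumptions:

```shell
# Print the thread count to use: SLURM_CPUS_PER_TASK if set, otherwise 1
omp_threads() {
    echo "${SLURM_CPUS_PER_TASK:-1}"
}

export OMP_NUM_THREADS="$(omp_threads)"
echo "OMP_NUM_THREADS=$OMP_NUM_THREADS"
```

In the hybrid example above, --cpus-per-task 16 is set, so the function would return 16 inside the job.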
If you need to submit a serial batch job that occupies only one core, it is important to specify only the -n 1 option:
#!/bin/bash
#
#SBATCH -n 1                       # number of tasks
#SBATCH -t 0-2:00                  # time (D-HH:MM)
#SBATCH -o slurm.%j.out            # STDOUT
#SBATCH -e slurm.%j.err            # STDERR
#SBATCH --mail-type=<type>         # notification trigger
#SBATCH --mail-user=<user>         # email address
# load any necessary software module if it's required
./myprog.x
Job arrays offer a mechanism for submitting and managing collections of similar jobs quickly and easily. All jobs must have the same initial options (e.g. size, time limit, etc.).
Job arrays are only supported for batch jobs and the array index values are specified using the -a option of the sbatch command. The option argument can be specific array index values, a range of index values, and an optional step size as shown in the examples below. Note that the minimum index value is zero and the maximum value is a Slurm configuration parameter (MaxArraySize minus one). Jobs which are part of a job array will have the environment variable SLURM_ARRAY_TASK_ID set to their array index value.
# Submit a job array with index values between 0 and 31
$ sbatch --array=0-31 -N1 tmp
# Submit a job array with index values of 1, 3, 5 and 7
$ sbatch --array=1,3,5,7 -N1 tmp
# Submit a job array with index values between 1 and 7
# with a step size of 2 (i.e. 1, 3, 5 and 7)
$ sbatch --array=1-7:2 -N1 tmp
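To see which indices a range with a step size covers, the expansion can be reproduced in plain shell. The function array_indices below is a hypothetical helper for illustration and is not part of Slurm.

```shell
# Print the indices covered by first-last:step, separated by spaces
# usage: array_indices FIRST LAST STEP
array_indices() {
    i=$1
    out=""
    while [ "$i" -le "$2" ]; do
        out="${out:+$out }$i"
        i=$(( i + $3 ))
    done
    echo "$out"
}

# Same range as --array=1-7:2
array_indices 1 7 2
```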
A maximum number of simultaneously running tasks from the job array may be specified using a “%” separator. For example, --array=0-15%4 will limit the number of simultaneously running tasks from this job array to 4.
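Inside the job script, SLURM_ARRAY_TASK_ID is typically used to pick the input for each array task. The sketch below assumes input files named input_<index>.dat, a naming scheme invented for this example, and falls back to index 0 when it is run outside a job array.

```shell
# Map an array index to an input file name (hypothetical naming scheme)
input_for_task() {
    echo "input_$1.dat"
}

# Fallback to 0 is an assumption so the snippet also runs outside a job array
task_id="${SLURM_ARRAY_TASK_ID:-0}"
input_file="$(input_for_task "$task_id")"

echo "array task $task_id processes $input_file"
```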
In order to list all running jobs, you can use the squeue command without any additional arguments. If you would like to see the running jobs of a particular user, use the command switch -u:
squeue -u user
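The squeue output can also be summarized with standard tools. The sketch below counts jobs per state from a list of state names on standard input. The function name count_states is hypothetical; on the cluster the input could come from squeue -h -u user -o "%T", which prints only the state column without a header.

```shell
# Count how many times each job state appears on standard input
count_states() {
    sort | uniq -c | awk '{print $2 "=" $1}'
}

# Example with sample data; on the cluster one might instead run:
#   squeue -h -u user -o "%T" | count_states
printf 'RUNNING\nPENDING\nRUNNING\n' | count_states
```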
If you need to cancel a running job, first obtain the job id using the squeue command, then pass that id to the scancel command.
The state of the compute nodes in the Slurm environment can be reviewed using the sinfo command.