Can you give me an example of a Slurm job script?

This KB Article References: High Performance Computing

This Information is Intended for: Instructors, Researchers, Staff, Students
Created: 09/30/2019 Last Updated: 11/15/2024
 
 

This is an example Slurm job script for the 28-core queues:

#!/bin/bash
#
#SBATCH --job-name=test
#SBATCH --output=res.txt
#SBATCH --ntasks-per-node=28
#SBATCH --nodes=2
#SBATCH --time=05:00
#SBATCH -p short-28core
#SBATCH --mail-type=BEGIN,END
#SBATCH --mail-user=jane.smith@stonybrook.edu

module load intel/oneAPI/2022.2
module load compiler/latest mpi/latest mkl/latest

cd /gpfs/projects/samples/intel_mpi_hello/
mpiicc mpi_hello.c -o intel_mpi_hello

mpirun ./intel_mpi_hello

This job uses 2 nodes, running 28 tasks per node (56 MPI ranks in total), with a time limit of 5 minutes in the short-28core queue. It compiles the intel_mpi_hello program and then runs it with mpirun.

If we named this script "test.slurm", we could submit the job using the following command:

sbatch test.slurm
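After submitting, it is often useful to capture the job ID and check on the job. The following is a minimal sketch, not part of the original example: it assumes sbatch prints its usual confirmation line ("Submitted batch job <id>"), and the helper name job_id_from_sbatch is hypothetical. squeue and scancel are standard Slurm commands.

```shell
#!/bin/bash
# Hypothetical helper: pull the job ID out of sbatch's confirmation line,
# assumed to look like "Submitted batch job 123456".
job_id_from_sbatch() {
    awk '/^Submitted batch job/ { print $4 }'
}

if command -v sbatch >/dev/null 2>&1; then
    job_id=$(sbatch test.slurm | job_id_from_sbatch)
    echo "Submitted job ${job_id}"
    squeue -j "${job_id}"       # show the job's current state in the queue
    # scancel "${job_id}"       # uncomment to cancel the job
else
    echo "sbatch not found; run this on a cluster login node" >&2
fi
```

Running squeue -u $USER instead lists all of your own jobs rather than a single one.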

Once the job has finished, the output (in res.txt) would look like this:

Hello world from processor sn088, rank 0 out of 56 processors
Hello world from processor sn111, rank 28 out of 56 processors
Hello world from processor sn088, rank 1 out of 56 processors
Hello world from processor sn111, rank 29 out of 56 processors
Hello world from processor sn088, rank 2 out of 56 processors
Hello world from processor sn111, rank 30 out of 56 processors
Hello world from processor sn088, rank 5 out of 56 processors
Hello world from processor sn111, rank 31 out of 56 processors
...

The processor names (sn088 and sn111) will vary depending on which two nodes the job is using. The order of the lines can also differ from run to run, because each rank prints independently.
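The total of 56 in the output above is simply the number of nodes multiplied by the tasks per node. As a small sketch (not part of the original script), the same figure can be computed inside a job from the environment variables Slurm sets. The fallback values 2 and 28 are assumptions that mirror the #SBATCH directives above, so the snippet also runs outside a job.

```shell
#!/bin/bash
# Inside a job, Slurm sets SLURM_JOB_NUM_NODES, and sets SLURM_NTASKS_PER_NODE
# when --ntasks-per-node is given. The fallbacks (2 and 28) are assumptions
# copied from the example script.
nodes=${SLURM_JOB_NUM_NODES:-2}
tasks_per_node=${SLURM_NTASKS_PER_NODE:-28}

# Total MPI ranks = nodes x tasks per node.
total_ranks=$(( nodes * tasks_per_node ))
echo "Expecting ${total_ranks} MPI ranks (${nodes} nodes x ${tasks_per_node} tasks per node)"
```

With the values from the example script, this gives 56, which matches the "out of 56 processors" text in each output line.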

This is an example Slurm job script for the GPU queues:

#!/bin/bash
#
#SBATCH --job-name=test-gpu
#SBATCH --output=res.txt
#SBATCH --ntasks-per-node=28
#SBATCH --nodes=2
#SBATCH --time=05:00
#SBATCH -p gpu
#SBATCH --mail-type=BEGIN,END
#SBATCH --mail-user=jane.smith@stonybrook.edu

module load anaconda/3
module load cuda102/toolkit/10.2
module load cudnn/7.4.5

source activate tensorflow2-gpu


cd /gpfs/projects/samples/tensorflow

python tensor_hello3.py

Breakdown of the GPU script:

The directive 

#SBATCH --job-name=test-gpu

gives the name "test-gpu" to your job.

The directives 

#SBATCH --ntasks-per-node=28
#SBATCH --nodes=2
#SBATCH --time=05:00

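indicate that we are requesting 2 nodes and will run 28 tasks per node, with a time limit of 5 minutes. A job that is still running when the limit is reached is terminated by the scheduler.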
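Slurm accepts several formats for --time, and 05:00 is read as minutes:seconds. The lines below are illustrative alternatives rather than part of the original script. A job script should contain only one --time directive, and the maximum time each queue allows is site-specific.

```shell
# Use only ONE of these directives in a job script.

# 5 minutes (minutes:seconds), as in the example above
#SBATCH --time=05:00

# 30 minutes (a bare number is interpreted as minutes)
#SBATCH --time=30

# 1 hour 30 minutes (hours:minutes:seconds)
#SBATCH --time=01:30:00

# 1 day (days-hours:minutes:seconds)
#SBATCH --time=1-00:00:00
```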

The directive 

#SBATCH -p gpu

indicates to the batch scheduler that you want to use the GPU queue.

The mail-related directives

#SBATCH --mail-type=BEGIN,END
#SBATCH --mail-user=jane.smith@stonybrook.edu

control whether (and when) the user is notified by email of changes to the job state. In this example, --mail-type=BEGIN,END means that an email is sent to the user when the job starts and again when it finishes.

Other useful mail-type options include:

  • FAIL (email upon job failure)
  • ALL (email for all state changes)

Note that emails will only be sent to "stonybrook.edu" addresses.

 

Each of these directives corresponds to an option of the sbatch command. For the full list of options, see the sbatch manual page by issuing the command:

man sbatch

 

For more information on Slurm, please also see the official documentation.

 
