This KB Article References: High Performance Computing
The following is an example Slurm job script for the 28-core queues:
#!/bin/bash
#
#SBATCH --job-name=test
#SBATCH --output=res.txt
#SBATCH --ntasks-per-node=28
#SBATCH --nodes=2
#SBATCH --time=05:00
#SBATCH -p short-28core
#SBATCH --mail-type=BEGIN,END
#SBATCH --mail-user=jane.smith@stonybrook.edu
module load intel/oneAPI/2022.2
module load compiler/latest mpi/latest mkl/latest
cd /gpfs/projects/samples/intel_mpi_hello/
mpiicc mpi_hello.c -o intel_mpi_hello
mpirun ./intel_mpi_hello
This job requests 2 nodes with 28 tasks per node (56 MPI ranks in total) for up to 5 minutes in the short-28core queue. It compiles mpi_hello.c and then runs the resulting intel_mpi_hello program.
If we named this script "test.slurm", we could submit the job using the following command:
sbatch test.slurm
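When a submission succeeds, sbatch prints a confirmation line of the form "Submitted batch job <id>". The following is a minimal sketch of capturing that ID for later use with squeue or scancel. The helper name extract_job_id and the job ID 12345 are illustrative assumptions, not part of the original example.

```shell
#!/bin/bash
# Hypothetical helper: pull the numeric job ID out of the sbatch
# confirmation line ("Submitted batch job <id>").
extract_job_id() {
    # The job ID is the fourth whitespace-separated field.
    printf '%s\n' "$1" | awk '{ print $4 }'
}

# Illustrative sample line; on the cluster this would come from
#   submit_output=$(sbatch test.slurm)
submit_output="Submitted batch job 12345"
job_id=$(extract_job_id "$submit_output")
echo "Job ID: $job_id"

# With a real job ID you could then check on or cancel the job:
#   squeue -j "$job_id"
#   scancel "$job_id"
```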
And the output (in res.txt) would look like this:
Hello world from processor sn088, rank 0 out of 56 processors
Hello world from processor sn111, rank 28 out of 56 processors
Hello world from processor sn088, rank 1 out of 56 processors
Hello world from processor sn111, rank 29 out of 56 processors
Hello world from processor sn088, rank 2 out of 56 processors
Hello world from processor sn111, rank 30 out of 56 processors
Hello world from processor sn088, rank 5 out of 56 processors
Hello world from processor sn111, rank 31 out of 56 processors
...
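The processor names (sn088 and sn111) will vary depending on which two nodes the job is using. The order of the lines may also differ between runs, because the ranks write their output concurrently.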
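One quick way to check that both nodes took part is to count the output lines per node. The sketch below assumes the output format shown above; the helper name count_ranks_per_node and the temporary sample file are hypothetical and stand in for res.txt.

```shell
#!/bin/bash
# Hypothetical helper: count how many "Hello world" lines each node printed.
count_ranks_per_node() {
    awk '/^Hello world from processor/ {
             node = $5
             sub(/,$/, "", node)   # strip the trailing comma after the node name
             count[node]++
         }
         END { for (n in count) print n, count[n] }' "$1" | sort
}

# Illustrative sample standing in for res.txt (node names taken from the example).
sample=$(mktemp)
cat > "$sample" <<'EOF'
Hello world from processor sn088, rank 0 out of 56 processors
Hello world from processor sn111, rank 28 out of 56 processors
Hello world from processor sn088, rank 1 out of 56 processors
Hello world from processor sn111, rank 29 out of 56 processors
EOF

count_ranks_per_node "$sample"
rm -f "$sample"
```

Run against the complete res.txt from the job above (count_ranks_per_node res.txt), each of the two nodes would be expected to report 28 lines, matching --ntasks-per-node=28.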
The following is an example Slurm job script for the GPU queues:
#!/bin/bash
#
#SBATCH --job-name=test-gpu
#SBATCH --output=res.txt
#SBATCH --ntasks-per-node=28
#SBATCH --nodes=2
#SBATCH --time=05:00
#SBATCH -p gpu
#SBATCH --mail-type=BEGIN,END
#SBATCH --mail-user=jane.smith@stonybrook.edu
module load anaconda/3
module load cuda102/toolkit/10.2
module load cudnn/7.4.5
source activate tensorflow2-gpu
cd /gpfs/projects/samples/tensorflow
python tensor_hello3.py
Breakdown:
The directive
#SBATCH --job-name=test-gpu
gives the name "test-gpu" to your job.
The directives
#SBATCH --ntasks-per-node=28
#SBATCH --nodes=2
#SBATCH --time=05:00
indicate that we are requesting 2 nodes and will run 28 tasks per node, with a time limit of 5 minutes (05:00 is read as minutes:seconds).
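The arithmetic behind these three directives can be sketched in a few lines of shell. The helper names total_tasks and mmss_to_seconds are hypothetical, and the time conversion handles only the minutes:seconds form used here, not the other time formats that sbatch accepts.

```shell
#!/bin/bash
# Hypothetical helpers illustrating the arithmetic behind the directives.

# Total tasks = number of nodes * tasks per node.
total_tasks() {
    echo $(( $1 * $2 ))
}

# Convert a "minutes:seconds" time limit to seconds.
# Only this one Slurm time format is handled in this sketch.
mmss_to_seconds() {
    printf '%s\n' "$1" | awk -F: '{ print $1 * 60 + $2 }'
}

echo "Total tasks: $(total_tasks 2 28)"          # prints "Total tasks: 56"
echo "Time limit:  $(mmss_to_seconds 05:00) s"   # prints "Time limit:  300 s"
```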
The directive
#SBATCH -p gpu
tells the batch scheduler to run the job in the GPU queue (-p is the short form of --partition).
The mail-related directives
#SBATCH --mail-type=BEGIN,END
#SBATCH --mail-user=jane.smith@stonybrook.edu
control whether (and when) the user is notified by email of changes to the job state. In this example, --mail-type=BEGIN,END means that an email is sent to the user when the job starts and again when it finishes.
Other useful mail-type options include:
- FAIL (email upon job failure)
- ALL (email for all state changes)
Note that emails will only be sent to "stonybrook.edu" addresses.
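As a sketch of how these options combine, the fragment below asks for mail only when the job ends or fails. The comma-separated form mirrors the BEGIN,END example above, and the address is the same placeholder used throughout this article.

```shell
#SBATCH --mail-type=END,FAIL
#SBATCH --mail-user=jane.smith@stonybrook.edu
```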
All of these directives are passed directly to the sbatch command. For a full list of options, see the sbatch manual page by issuing the command:
man sbatch
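Because each #SBATCH line is simply an sbatch option, it can help to list the options a script sets before comparing them with the manual page. A minimal sketch, in which the helper name list_sbatch_directives and the temporary sample script are assumptions made for illustration:

```shell
#!/bin/bash
# Hypothetical helper: print the sbatch options set by #SBATCH lines in a script.
list_sbatch_directives() {
    sed -n 's/^#SBATCH[[:space:]]*//p' "$1"
}

# Illustrative sample script containing a few directives from this article.
sample=$(mktemp)
cat > "$sample" <<'EOF'
#!/bin/bash
#
#SBATCH --job-name=test
#SBATCH --nodes=2
#SBATCH -p short-28core
echo "job body"
EOF

list_sbatch_directives "$sample"
rm -f "$sample"
```

The same options can also be given on the sbatch command line, for example sbatch --time=10:00 test.slurm. Options given on the command line take precedence over those set in the script.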
For more information on SLURM, please also see the official documentation.