Example 1: Serial Job (Hello World)
A simple single-threaded job for testing or running small scripts.
Create `hello.slurm`:

```bash
#!/bin/bash
#SBATCH -p short-40core-shared
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --mem 5G
#SBATCH -t 00:05:00
#SBATCH -o hello.out

echo "Hello from SeaWulf!"
```
Submit the job:

```bash
sbatch hello.slurm
```
Explanation: This is a single-threaded job, so only 1 task is requested on a shared partition. Memory is set modestly (5 GB) because a serial program uses minimal resources. This is typical for quick tests or small scripts.
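As a quick sanity check after submitting, you can watch the job in the queue and read its output file once it finishes. The commands below are standard Slurm and shell utilities; the job ID shown in the comment is just a placeholder.

```bash
# Submit and note the job ID that sbatch prints
sbatch hello.slurm        # e.g. "Submitted batch job 1234567"

# Check whether your jobs are pending or running
squeue -u $USER

# After the job completes, read the file named by "#SBATCH -o"
cat hello.out
```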
Example 2: Python Job
Running a Python script that may use multiple cores through libraries.
Create `python_job.slurm`:

```bash
#!/bin/bash
#SBATCH -p short-40core
#SBATCH -N 1
#SBATCH -n 40
#SBATCH -t 00:10:00
#SBATCH -o python.out

module load anaconda
conda activate my-environment
python script.py
```
Submit with:

```bash
sbatch python_job.slurm
```
Explanation: Python scripts are typically single-threaded, so requesting all 40 cores may be unnecessary unless your script uses multi-threaded libraries like NumPy, PyTorch, or Dask. If your script is single-threaded, you could run it on one core on a shared partition to save resources.
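For the single-threaded case, a leaner request along the following lines avoids tying up a full node. This is a sketch that reuses the shared partition and modest memory request from Example 1; adjust the memory, time, and environment name to your workload.

```bash
#!/bin/bash
#SBATCH -p short-40core-shared   # shared partition, as in Example 1
#SBATCH -N 1
#SBATCH -n 1                     # a single task for a single-threaded script
#SBATCH --mem 5G                 # assumed modest memory; adjust as needed
#SBATCH -t 00:10:00
#SBATCH -o python_serial.out

module load anaconda
conda activate my-environment
python script.py
```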
Example 3: MPI Job
Run a parallel MPI program across multiple nodes.
Create `mpi_job.slurm`:

```bash
#!/bin/bash
#SBATCH -p short-40core
#SBATCH -N 2
#SBATCH -n 80
#SBATCH -t 00:10:00
#SBATCH -o mpi.out

module load openmpi
mpirun ./my_mpi_program
```
Submit with:

```bash
sbatch mpi_job.slurm
```
Explanation: MPI programs run multiple processes in parallel. Here, we request 2 nodes with 40 tasks per node to fully utilize all cores on both nodes. This setup is suitable for distributed computation where each MPI rank performs part of the workload.
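An equivalent way to express the same request is to specify tasks per node and let Slurm's environment drive the launch. This is a sketch under the same 2-node, 80-rank assumptions; `mpirun` normally detects the Slurm allocation on its own, so passing `-np` explicitly is optional.

```bash
#!/bin/bash
#SBATCH -p short-40core
#SBATCH -N 2                   # two full nodes
#SBATCH --ntasks-per-node=40   # 40 MPI ranks per node = 80 ranks total
#SBATCH -t 00:10:00
#SBATCH -o mpi.out

module load openmpi

# SLURM_NTASKS is set by Slurm to the total number of tasks (80 here)
mpirun -np $SLURM_NTASKS ./my_mpi_program
```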
Example 4: OpenMP Job
Run an OpenMP program on a single node using all cores.
Create `openmp_job.slurm`:

```bash
#!/bin/bash
#SBATCH -p short-40core
#SBATCH -N 1
#SBATCH -n 1
#SBATCH -c 40
#SBATCH -t 00:10:00
#SBATCH -o openmp.out

export OMP_NUM_THREADS=40
./my_openmp_program
```
Submit with:

```bash
sbatch openmp_job.slurm
```
Explanation: OpenMP uses threads instead of separate MPI processes. We request 1 task and assign it 40 cores with `-c 40`. The environment variable `OMP_NUM_THREADS` ensures the program uses all cores on the node efficiently.
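To keep the thread count in sync with whatever you request via `-c`, you can derive `OMP_NUM_THREADS` from Slurm's environment instead of hard-coding 40. A small sketch of that variant:

```bash
#!/bin/bash
#SBATCH -p short-40core
#SBATCH -N 1
#SBATCH -n 1
#SBATCH -c 40          # change this once; the export below follows it
#SBATCH -t 00:10:00
#SBATCH -o openmp.out

# SLURM_CPUS_PER_TASK mirrors the -c value, so the program spawns
# exactly as many threads as cores were allocated
export OMP_NUM_THREADS=$SLURM_CPUS_PER_TASK
./my_openmp_program
```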
Example 5: GPU Job
Run a CUDA-enabled program on a GPU node.
Create `gpu_job.slurm`:

```bash
#!/bin/bash
#SBATCH -p a100
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --gres=gpu:1
#SBATCH -t 00:30:00
#SBATCH -o gpu.out

module load cuda120/toolkit/12.0
./my_gpu_program
```
Submit with:

```bash
sbatch gpu_job.slurm
```
Explanation: GPU jobs request a GPU with `--gres=gpu:1`. Only one CPU task is needed to launch the program, but additional CPU cores can be requested (with `-c`) if the GPU program also does multi-threaded CPU work. This setup suits GPU-accelerated workloads such as deep learning or CUDA computations.
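If your GPU code also keeps several CPU threads busy (for example, data loading in a deep learning job), you can add CPU cores alongside the GPU request. The sketch below reuses the `a100` partition and CUDA module from Example 5; the `-c 8` value is only an illustration, and the `nvidia-smi` call is an optional check that the GPU is visible.

```bash
#!/bin/bash
#SBATCH -p a100
#SBATCH -N 1
#SBATCH -n 1
#SBATCH -c 8             # illustrative: extra CPU cores for data loading, etc.
#SBATCH --gres=gpu:1
#SBATCH -t 00:30:00
#SBATCH -o gpu.out

module load cuda120/toolkit/12.0

# Optional: record which GPU the job was given before the real work starts
nvidia-smi

./my_gpu_program
```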
Example 6: Array Job
Run multiple similar jobs with different parameters.
Create `array_job.slurm`:

```bash
#!/bin/bash
#SBATCH --job-name=parameter_sweep
#SBATCH --array=1-100
#SBATCH --nodes=1
#SBATCH --ntasks-per-node=40
#SBATCH --time=02:00:00
#SBATCH -p short-40core
#SBATCH -o array_%A_%a.out

module load python/3.9
cd $SLURM_SUBMIT_DIR
python simulation.py --param-set $SLURM_ARRAY_TASK_ID
```
Submit with:

```bash
sbatch array_job.slurm
```
Explanation: Array jobs create multiple job instances from a single script. The `--array=1-100` directive creates 100 jobs, each with a unique `$SLURM_ARRAY_TASK_ID` value. This is ideal for parameter sweeps, Monte Carlo simulations, or processing many input files. In the output filename, `%A` represents the array job ID and `%a` represents the individual task ID.
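`--array` also accepts a throttle: `--array=1-100%10` runs at most 10 tasks at a time, which is courteous on busy partitions. The sketch below combines that with a common pattern for mapping task IDs to input files; `inputs.txt` (one filename per line) and `process.py` are hypothetical names, and the single-core shared-partition request assumes each task runs a serial script.

```bash
#!/bin/bash
#SBATCH --job-name=file_sweep
#SBATCH --array=1-100%10         # 100 tasks, at most 10 running at once
#SBATCH -N 1
#SBATCH -n 1                     # one core per task for a serial script
#SBATCH -t 00:30:00
#SBATCH -p short-40core-shared
#SBATCH -o array_%A_%a.out

module load python/3.9
cd $SLURM_SUBMIT_DIR

# Hypothetical mapping: array task N processes line N of inputs.txt
INPUT=$(sed -n "${SLURM_ARRAY_TASK_ID}p" inputs.txt)
python process.py "$INPUT"
```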
Quick Reference: When to Use Each Type
| Job Type | Use When | Key Directives |
|---|---|---|
| Serial | Single-threaded programs, testing, small scripts | `-n 1` on a shared partition |
| Python | Python scripts with multi-threaded libraries | `-n 40` or `-n 1`, depending on libraries |
| MPI | Distributed parallel programs across nodes | `-N 2 -n 80` |
| OpenMP | Shared-memory threading on a single node | `-n 1 -c 40` |
| GPU | CUDA or GPU-accelerated applications | `--gres=gpu:1` |
| Array | Parameter sweeps, multiple similar runs | `--array=1-100` |
- Always include `#SBATCH` lines for partition, nodes, tasks, time, and output
- Check your output files (`.out`) for results
- Load necessary modules (e.g., `anaconda/3`, `cuda`, `openmpi`) within your scripts
- Adjust resources according to program type to efficiently utilize nodes
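To see which modules are available before writing a script, the standard environment-modules commands apply; exact module names on SeaWulf may differ from the examples shown in the comments.

```bash
module avail            # list all modules available on the cluster
module avail cuda       # narrow the listing to CUDA-related modules
module load openmpi     # load a module (also works inside job scripts)
module list             # show which modules are currently loaded
```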