SLURM Overview & Commands

What is SLURM?

SLURM (Simple Linux Utility for Resource Management) is an open-source workload manager and job scheduler used on SeaWulf to manage compute resources and run jobs efficiently across the cluster.

How SLURM Works

SLURM serves three primary functions on SeaWulf:

Resource Allocation: Allocates exclusive or shared access to compute nodes for specified durations.

Job Execution: Provides a framework for starting, executing, and monitoring jobs on allocated nodes.

Queue Management: Manages job queues and arbitrates resource contention between competing jobs.

Getting Started

Before using SLURM commands on SeaWulf, load the SLURM module:

module load slurm

Essential SLURM Commands

Here are the core commands you'll use to submit and manage jobs:

Function | Command | Description
Submit batch job | sbatch [script] | Submit a job script to the queue
Interactive job | srun --pty bash | Start an interactive session on a compute node
Check job status | squeue | View the current job queue and status
Check your jobs | squeue --user=$USER | View only your jobs
Cancel job | scancel [job_id] | Cancel a running or queued job
Job details | scontrol show job [job_id] | Show detailed job information
Node information | sinfo | Display node and partition information

Quick Examples

Check your running jobs:

squeue --user=$USER

View details about a specific job:

scontrol show job 123456

See available partitions and nodes:

sinfo
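
Submit a job script and cancel a job if needed (the script name my_job.slurm and the job ID are placeholders):

sbatch my_job.slurm
scancel 123456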

Job Script Basics

SLURM jobs are defined using job scripts that specify resource requirements and commands to run. Job scripts contain special directives that begin with #SBATCH.

Essential SLURM Directives

Resource | Directive | Example
Job name | #SBATCH --job-name= | --job-name=my_job
Number of nodes | #SBATCH --nodes= | --nodes=2
Tasks per node | #SBATCH --ntasks-per-node= | --ntasks-per-node=40
Memory per node | #SBATCH --mem= | --mem=64GB
Wall time | #SBATCH --time= | --time=02:30:00
Partition/Queue | #SBATCH -p | -p short-40core
Output file | #SBATCH --output= | --output=job_%j.out
Error file | #SBATCH --error= | --error=job_%j.err

Note: The %j placeholder in filenames is automatically replaced with the job ID, making it easy to track output from multiple jobs.
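
Putting these directives together, a minimal job script might look like the sketch below; the values are taken from the examples above, and ./my_program is a placeholder for your own commands:

#!/bin/bash
#SBATCH --job-name=my_job
#SBATCH --nodes=2
#SBATCH --ntasks-per-node=40
#SBATCH --mem=64GB
#SBATCH --time=02:30:00
#SBATCH -p short-40core
#SBATCH --output=job_%j.out
#SBATCH --error=job_%j.err

module purge
module load slurm    # load any other modules your program needs here

cd $SLURM_SUBMIT_DIR
./my_program         # placeholder: replace with your own commands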

Useful Environment Variables

SLURM automatically sets several environment variables that your jobs can use:

Variable | Description
$SLURM_JOBID | Unique job identifier
$SLURM_SUBMIT_DIR | Directory from which the job was submitted
$SLURM_JOB_NODELIST | List of nodes allocated to the job
$SLURM_NTASKS | Total number of tasks for the job
$SLURM_CPUS_PER_TASK | Number of CPUs allocated per task
$SLURM_JOB_NAME | Name of the job

Example usage in a script:

echo "Job ID: $SLURM_JOBID"
echo "Running on nodes: $SLURM_JOB_NODELIST"
cd $SLURM_SUBMIT_DIR
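
The task-count variables are useful when launching parallel work inside a job; for example (./my_program is a placeholder for an MPI or multi-task executable):

srun -n $SLURM_NTASKS ./my_program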

Best Practices

Resource Estimation

Request only the resources you actually need. Over-requesting resources can lead to longer queue times and reduced system efficiency.
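
To calibrate future requests, you can check what a completed job actually used with SLURM's accounting command (assuming job accounting is enabled on the cluster; the job ID is a placeholder):

sacct -j 123456 --format=JobID,Elapsed,MaxRSS,NCPUS,State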

Output Files

Always specify output and error files to capture job information. Use %j to include the job ID in filenames:

#SBATCH --output=job_%j.out
#SBATCH --error=job_%j.err

Module Loading

Load all required modules within your job script to ensure consistent environments across compute nodes. Always start with module purge to avoid conflicts.
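
For example, at the top of your job script (the gcc module is shown only as an illustration; load whatever your application actually requires):

module purge
module load slurm
module load gcc   # illustration only: replace with the modules your application needs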

Test First

Test your scripts with small resource requests first. Use interactive sessions to debug before submitting large batch jobs.
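
For example, a short single-node interactive session for debugging might look like this (the partition name comes from the directives table above; adjust it to whatever is appropriate on SeaWulf):

srun -N 1 -n 1 -t 00:30:00 -p short-40core --pty bash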

Need Help?

For detailed SLURM documentation, visit the official SLURM documentation. For SeaWulf-specific questions, submit a ticket to the IACS support system.