SeaWulf provides several GPU-accelerated nodes based on Intel Haswell and AMD Milan CPU architectures, equipped with NVIDIA K80, P100, V100, or A100 GPUs. These nodes are optimized for GPU-accelerated workloads in areas such as AI, molecular dynamics, and image processing.
Note: All shared (multi-user) queues are accessed from the milan1/milan2 login nodes. The main benefits of shared queues are shorter wait times and the ability to request exactly the resources your job requires.
Available GPU Queues
Haswell Nodes (login1/login2 access)
Note: The gpu, gpu-long, and gpu-large partitions use K80 GPUs; the p100 and v100 partitions use P100 and V100 GPUs, respectively.
| Queue | CPU Architecture | Vector/Matrix Extension | CPU Cores per Node | GPUs per Node | Node Memory | Default Runtime | Max Runtime | Max Nodes | Max Jobs per User | Multi-User |
|---|---|---|---|---|---|---|---|---|---|---|
| gpu | Intel Haswell | AVX2 | 28 | 4 | 128 GB | 1 hr | 8 hrs | 2 | 2 | No |
| gpu-long | Intel Haswell | AVX2 | 28 | 4 | 128 GB | 8 hrs | 48 hrs | 1 | 2 | No |
| gpu-large | Intel Haswell | AVX2 | 28 | 4 | 128 GB | 1 hr | 8 hrs | 4 | 1 | No |
| p100 | Intel Haswell | AVX2 | 12 | 2 | 64 GB | 1 hr | 24 hrs | 1 | 1 | No |
| v100 | Intel Haswell | AVX2 | 28 | 2 | 128 GB | 1 hr | 24 hrs | 1 | 1 | No |
A100 Nodes (milan1/milan2 access)
| Queue | CPU Architecture | Vector/Matrix Extension | CPU Cores per Node | GPUs per Node | Node Memory | Default Runtime | Max Runtime | Max Nodes | Max Jobs per User | Multi-User |
|---|---|---|---|---|---|---|---|---|---|---|
| a100 | AMD Milan | AVX2 | 96 | 4 | 256 GB | 1 hr | 8 hrs | 2 | 2 | Yes |
| a100-long | AMD Milan | AVX2 | 96 | 4 | 256 GB | 8 hrs | 48 hrs | 1 | 2 | Yes |
| a100-large | AMD Milan | AVX2 | 96 | 4 | 256 GB | 1 hr | 8 hrs | 4 | 1 | Yes |
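Because the a100 queues are multi-user, a job can request a slice of a node rather than a whole one. For example, the following (resource values are illustrative) requests one GPU, 24 cores, and 64 GB of memory on a shared a100 node:
srun -p a100 -N 1 -n 24 --gpus=1 --mem=64G --pty bash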
Accessing GPU Nodes
Submit GPU jobs using the SLURM workload manager. Load the slurm module before submitting:
module load slurm
sbatch job_script.sh
Example interactive session:
srun -J myjob -N 1 -p a100 --gpus=1 --pty bash
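Once the interactive shell starts on the compute node, you can confirm which GPU was allocated before running anything:
nvidia-smi -L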
Example batch script:
#!/bin/bash
#SBATCH --job-name=gpu_test
#SBATCH --output=res.txt
#SBATCH -p a100
#SBATCH --gpus=1
#SBATCH --time=02:00:00
# Load the CUDA toolkit, then compile and run on the allocated GPU
module load cuda120/toolkit/12.0
nvcc mycode.cu -o mycode
./mycode
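Assuming the script above is saved as gpu_test.sh (the filename is illustrative), submit it and track its state with standard SLURM commands:
sbatch gpu_test.sh
squeue -u $USER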
Using CUDA for GPU Acceleration
To compile and run GPU-accelerated code, load the appropriate CUDA toolkit:
# For K80, P100, and V100 nodes
module load cuda113/toolkit/11.3
# For A100 nodes
module load cuda120/toolkit/12.0
Compile with nvcc:
nvcc input.cu -o output
Sample CUDA program available at: /gpfs/projects/samples/cuda/test.cu
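If you prefer a self-contained test, a minimal vector-addition program along the following lines (a sketch, not the contents of test.cu) compiles with the nvcc command above:
#include <cstdio>
#include <cuda_runtime.h>

// Each thread adds one pair of elements.
__global__ void vecAdd(const float *a, const float *b, float *c, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) c[i] = a[i] + b[i];
}

int main() {
    const int n = 1 << 20;
    const size_t bytes = n * sizeof(float);

    // Unified (managed) memory keeps the example short; explicit
    // cudaMemcpy between host and device buffers also works.
    float *a, *b, *c;
    cudaMallocManaged(&a, bytes);
    cudaMallocManaged(&b, bytes);
    cudaMallocManaged(&c, bytes);
    for (int i = 0; i < n; ++i) { a[i] = 1.0f; b[i] = 2.0f; }

    const int threads = 256;
    const int blocks = (n + threads - 1) / threads;
    vecAdd<<<blocks, threads>>>(a, b, c, n);
    cudaDeviceSynchronize();  // wait for the kernel before reading results

    printf("c[0] = %f (expected 3.0)\n", c[0]);
    cudaFree(a); cudaFree(b); cudaFree(c);
    return 0;
}
Compile and run it on a GPU node with, for example, nvcc vecadd.cu -o vecadd followed by ./vecadd.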
Monitoring GPU Usage
Monitor GPU performance during jobs to ensure efficient utilization:
nvidia-smi
module load nvtop
nvtop
- nvidia-smi: Displays GPU utilization, memory usage, and active processes.
- nvtop: Interactive, real-time GPU monitoring tool similar to htop.
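To keep a lightweight utilization log while a job runs, nvidia-smi can also poll in CSV mode; the 30-second interval and field list below are just one reasonable choice:
nvidia-smi --query-gpu=timestamp,utilization.gpu,memory.used --format=csv -l 30 > gpu_usage.log &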
Best Practices
- Always request GPUs explicitly using #SBATCH --gpus=[number].
- Request memory with #SBATCH --mem=[amount] to stay within node limits (see the sketch after this list).
- Avoid running compute workloads on login nodes.
- Monitor usage regularly and release resources promptly after jobs finish.
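As a sketch of the two flags above in context (the partition, GPU count, memory, and runtime values are illustrative):
#!/bin/bash
#SBATCH -p gpu
#SBATCH --gpus=2
#SBATCH --mem=64G
#SBATCH --time=04:00:00
# ...job commands follow here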
Note on NVwulf Access
If your workloads require additional GPU capacity or dedicated access, you may request access to the NVwulf cluster through the HPC portal. NVwulf provides additional GPU nodes for high-demand or long-running jobs.