NVHPC (NVIDIA HPC SDK)

NVIDIA HPC SDK on SeaWulf

This KB Article References: High Performance Computing
This Information is Intended for: Instructors, Researchers, Staff, Students
Created: 03/07/2025
Last Updated: 03/07/2025

Introduction to NVIDIA HPC SDK on SeaWulf

The NVIDIA HPC SDK (NVHPC) is a comprehensive suite of compilers, libraries, and tools for high-performance computing on NVIDIA GPU-accelerated systems. It provides the building blocks for developing applications that leverage the computational power of NVIDIA GPUs in the SeaWulf computing environment.

NVHPC is built on NVIDIA's compiler technology and supports the C, C++, and Fortran programming languages. It offers specialized optimizations for GPU acceleration, parallel computing, and scientific workloads, making it an essential tool for researchers and developers running computationally intensive applications on the GPU nodes of the SeaWulf cluster.

Available NVHPC Versions

SeaWulf currently provides the following NVHPC versions:

  • nvidia/nvhpc/21.5 - Compatible with CUDA 11.x
  • nvidia/nvhpc/21.7 - Compatible with CUDA 11.x
  • nvidia/nvhpc/23.7 - Compatible with CUDA 11.x and 12.x
  • nvidia/nvhpc/23.11 - Compatible with CUDA 12.x
  • nvidia/nvhpc/24.11 - Compatible with CUDA 12.x; latest version (recommended)

To load a specific NVHPC version, use the module command:

module load nvidia/nvhpc/24.11

Note: There are also specialized versions of NVHPC with different MPI implementations or without MPI (nompi). Choose the appropriate module based on your parallel programming needs.
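
To list every NVHPC module and variant installed on SeaWulf, you can query the module system directly:

module avail nvidia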

NVHPC Variants

In addition to the standard NVHPC modules, SeaWulf offers several specialized variants:

  • nvidia/nvhpc-nompi - NVHPC without MPI support
  • nvidia/nvhpc-hpcx - NVHPC with Mellanox HPC-X MPI
  • nvidia/nvhpc-openmpi3 - NVHPC with OpenMPI 3.x
  • nvidia/nvhpc-byo-compiler - "Bring Your Own Compiler" variant
  • nvidia/nvhpc-hpcx-cuda11 - Specific for CUDA 11.x compatibility
  • nvidia/nvhpc-hpcx-cuda12 - Specific for CUDA 12.x compatibility

Example usage:

module load nvidia/nvhpc-hpcx-cuda12/23.11

Important: Ensure the NVHPC variant you choose is compatible with your CUDA requirements and parallel programming model.

NVHPC Compilers

The NVIDIA HPC SDK provides several compilers optimized for NVIDIA GPUs:

  • C Compiler: nvc
  • C++ Compiler: nvc++
  • Fortran Compiler: nvfortran

Example usage:

nvc myprogram.c -o myprogram # Compile a C program
nvc++ myprogram.cpp -o myprogram # Compile a C++ program
nvfortran myprogram.f90 -o myprogram # Compile a Fortran program
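
For instance, a minimal stand-in for myprogram.c that the first command above could compile (the contents are illustrative, not from the original article):

/* myprogram.c - minimal host-only program to verify the toolchain */
#include <stdio.h>

int main(void) {
    printf("Compiled with the NVIDIA HPC SDK\n");
    return 0;
}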

GPU Acceleration with NVHPC

NVHPC provides multiple programming models for GPU acceleration:

  • OpenACC: A directive-based approach for GPU programming
  • CUDA: NVIDIA's parallel computing platform
  • OpenMP: Standard API for parallel programming with GPU target support
  • Standard Language Features: C++17 parallel algorithms, Fortran DO CONCURRENT

Example OpenACC compilation:

nvc -acc -Minfo=accel myprogram.c -o myprogram
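
For context, a minimal OpenACC sketch of what myprogram.c might contain (the file name and contents are illustrative):

#include <stdio.h>

#define N 1000000

int main(void) {
    static float x[N], y[N];

    for (int i = 0; i < N; i++) {
        x[i] = 1.0f;
        y[i] = 2.0f;
    }

    /* Offload the SAXPY-style loop to the GPU via an OpenACC directive */
    #pragma acc parallel loop copyin(x[0:N]) copy(y[0:N])
    for (int i = 0; i < N; i++)
        y[i] = 2.0f * x[i] + y[i];

    printf("y[0] = %f\n", y[0]);
    return 0;
}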

Example CUDA compilation (CUDA C++ sources are built with nvcc, the CUDA compiler bundled with the SDK, while nvfortran with the -cuda flag handles CUDA Fortran):

nvcc mycudaprogram.cu -o mycudaprogram # Compile a CUDA C++ program
nvfortran -cuda mycudaprogram.cuf -o mycudaprogram # Compile a CUDA Fortran program

Note: The -Minfo=accel flag on the OpenACC command above provides detailed information about the accelerator code generated by the compiler.
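
NVHPC also compiles OpenMP target offload code with the -mp=gpu flag. A minimal sketch (the file name omp_example.c is illustrative):

#include <stdio.h>

#define N 1000000

int main(void) {
    static double a[N];

    /* Offload the loop to the GPU with OpenMP target directives */
    #pragma omp target teams distribute parallel for map(from: a[0:N])
    for (int i = 0; i < N; i++)
        a[i] = 2.0 * i;

    printf("a[1] = %f\n", a[1]);
    return 0;
}

Example OpenMP offload compilation:

nvc -mp=gpu -Minfo=mp omp_example.c -o omp_example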

MPI Integration with NVHPC

For parallel programming with MPI, NVHPC provides compiler wrappers that automatically include the necessary MPI libraries and flags:

MPI Wrappers:

  • mpicc: for C programs
  • mpicxx / mpic++: for C++ programs
  • mpifort: for Fortran programs
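
Example usage:

mpicc mpi_program.c -o mpi_program # Compile an MPI C program
mpicxx mpi_program.cpp -o mpi_program # Compile an MPI C++ program
mpifort mpi_program.f90 -o mpi_program # Compile an MPI Fortran program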

Example SLURM script with NVHPC and MPI:

#!/bin/bash
#SBATCH --job-name=nvhpc_test
#SBATCH --output=nvhpc_test.out
#SBATCH -p a100
#SBATCH --nodes=1
#SBATCH --ntasks-per-node=2
#SBATCH --gres=gpu:2
#SBATCH --time=01:00:00

# Load necessary modules
module load nvidia/nvhpc-hpcx-cuda12/23.11

# Compile and run your code
mpicc -acc=gpu -Minfo=accel mpi_gpu_example.c -o mpi_gpu_example
mpirun -np $SLURM_NTASKS ./mpi_gpu_example
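
The script above references mpi_gpu_example.c, which the article does not show. A minimal sketch of what such a program might contain, binding each MPI rank to its own GPU through the OpenACC runtime API:

#include <mpi.h>
#include <openacc.h>
#include <stdio.h>

#define N 1000000

int main(int argc, char **argv) {
    MPI_Init(&argc, &argv);

    int rank;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    /* Bind each rank to a different GPU on the node */
    int ngpus = acc_get_num_devices(acc_device_nvidia);
    if (ngpus > 0)
        acc_set_device_num(rank % ngpus, acc_device_nvidia);

    /* Each rank runs its own GPU reduction */
    double sum = 0.0;
    #pragma acc parallel loop reduction(+:sum)
    for (int i = 0; i < N; i++)
        sum += 1.0;

    printf("Rank %d computed sum = %.0f\n", rank, sum);

    MPI_Finalize();
    return 0;
}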

Optimization Tips for NVHPC

To maximize performance with NVHPC on GPU nodes in SeaWulf, consider the following optimization strategies:

  • Basic Optimization Flags:
    # High optimization level
    nvc -O3 myprogram.c -o myprogram
  • GPU Targeting:
    # Target specific GPU architecture
    nvc -acc=gpu -gpu=cc80 myprogram.c -o myprogram

    Note: Common compute capability values include cc70 (Volta), cc75 (Turing), cc80 (Ampere), and cc90 (Hopper). Check your GPU architecture and use the appropriate value.

  • Memory Usage Optimization (see the sketch after this list):
    # Managed memory model for simplicity
    nvc -acc=gpu -gpu=managed myprogram.c -o myprogram
  • Profiling Support:
    # Enable profiling with NVIDIA tools
    nvc -acc=gpu -gpu=lineinfo myprogram.c -o myprogram
  • Math Library Optimization:
    # Link against the SDK's bundled NVIDIA math libraries
    nvc -acc=gpu -cudalib=cublas,cufft myprogram.c -o myprogram
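
As referenced in the memory usage tip above, -gpu=managed lets ordinary heap allocations migrate between host and device automatically. A minimal sketch (the file name managed_example.c is illustrative):

#include <stdio.h>
#include <stdlib.h>

int main(void) {
    int n = 1 << 20;

    /* With -gpu=managed, this ordinary malloc'd buffer is accessible
       from both host and device without explicit data clauses */
    double *a = malloc(n * sizeof(double));

    #pragma acc parallel loop
    for (int i = 0; i < n; i++)
        a[i] = 2.0 * i;

    printf("a[10] = %f\n", a[10]);
    free(a);
    return 0;
}

nvc -acc=gpu -gpu=managed managed_example.c -o managed_example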

NVHPC Tools

The NVIDIA HPC SDK includes several tools to help optimize and debug your GPU-accelerated applications:

  • NVIDIA Nsight Systems: System-wide performance analysis tool
  • NVIDIA Nsight Compute: Interactive kernel profiler
  • NVIDIA Debugger (cuda-gdb): For debugging GPU applications
  • NVTOP: Interactive GPU process monitor (available as separate modules: nvtop/3.0.1 and nvtop/3.1.0)

Example usage:

nsys profile ./myprogram # Profile application with Nsight Systems
module load nvtop/3.1.0
nvtop # Monitor GPU usage interactively
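
The kernel profiler and debugger have command-line entry points as well, both shipped with the NVHPC modules:

ncu ./myprogram # Profile individual kernels with Nsight Compute
cuda-gdb ./myprogram # Debug a GPU application interactively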

NVHPC Version Comparison

Feature                  | NVHPC 21.x        | NVHPC 23.x        | NVHPC 24.11
Primary CUDA Support     | CUDA 11.x         | CUDA 11.x, 12.x   | CUDA 12.x
GPU Architecture Support | Up to Ampere      | Up to Hopper      | Up to Hopper
C++ Standard Support     | C++17             | C++17/20          | C++20
Fortran Standard Support | Fortran 2003/2008 | Fortran 2008/2018 | Fortran 2018
Recommended for          | Legacy code       | General use       | New projects, best performance

Resources and Documentation

For detailed information on NVIDIA HPC SDK features, optimization techniques, and programming guides, refer to NVIDIA's official HPC SDK documentation at https://docs.nvidia.com/hpc-sdk/.

For SeaWulf-specific questions and support with NVIDIA HPC SDK, please contact the SeaWulf support team.