NVIDIA HPC Benchmarks#

The NVIDIA HPC Benchmarks collection provides four benchmarks that are widely used in the HPC community (HPL, HPL-MxP, HPCG, and STREAM), each optimized for performance on NVIDIA-accelerated HPC systems.

NVIDIA’s HPL and HPL-MxP benchmarks are based on the Netlib HPL benchmark and the HPL-MxP benchmark. Both are software packages that solve a (random) dense linear system on distributed-memory computers equipped with NVIDIA GPUs: HPL works in double-precision (64-bit) arithmetic, while HPL-MxP works in mixed-precision arithmetic using Tensor Cores.
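To make the mixed-precision idea concrete, the following is a minimal single-process sketch in pure Python. It factorizes a small random dense matrix with every stored value rounded to 32 bits, then recovers 64-bit accuracy with classic iterative refinement. This is an illustration under stated assumptions, not NVIDIA's implementation: the helper names (`to_f32`, `lu_factor_f32`, `solve_mixed`, `random_system`) are hypothetical, float32 stands in for the lower precisions used on Tensor Cores, and the refinement scheme shown here may differ from the one the benchmark uses.

```python
import random
import struct


def to_f32(x):
    """Round a 64-bit Python float to the nearest 32-bit float."""
    return struct.unpack("f", struct.pack("f", x))[0]


def lu_factor_f32(a):
    """LU-factorize a copy of `a` with partial pivoting, rounding to float32."""
    n = len(a)
    lu = [[to_f32(v) for v in row] for row in a]
    perm = list(range(n))
    for k in range(n):
        # Partial pivoting: move the largest remaining entry of column k up.
        p = max(range(k, n), key=lambda i: abs(lu[i][k]))
        lu[k], lu[p] = lu[p], lu[k]
        perm[k], perm[p] = perm[p], perm[k]
        for i in range(k + 1, n):
            lu[i][k] = to_f32(lu[i][k] / lu[k][k])
            for j in range(k + 1, n):
                lu[i][j] = to_f32(lu[i][j] - lu[i][k] * lu[k][j])
    return lu, perm


def lu_solve(lu, perm, b):
    """Solve with the stored factors: forward, then backward substitution."""
    n = len(b)
    y = [b[perm[i]] for i in range(n)]
    for i in range(n):  # L has an implicit unit diagonal
        for j in range(i):
            y[i] -= lu[i][j] * y[j]
    for i in reversed(range(n)):
        for j in range(i + 1, n):
            y[i] -= lu[i][j] * y[j]
        y[i] /= lu[i][i]
    return y


def residual(a, x, b):
    """Return b - A x, evaluated in 64-bit arithmetic."""
    return [bi - sum(aij * xj for aij, xj in zip(row, x))
            for row, bi in zip(a, b)]


def solve_mixed(a, b, refinements=5):
    """Low-precision factorization followed by 64-bit iterative refinement."""
    lu, perm = lu_factor_f32(a)
    x = lu_solve(lu, perm, b)
    for _ in range(refinements):
        r = residual(a, x, b)               # residual in double precision
        correction = lu_solve(lu, perm, r)  # reuse the cheap factors
        x = [xi + ci for xi, ci in zip(x, correction)]
    return x


def random_system(n, seed=0):
    """Random dense system, made diagonally dominant so it is well conditioned."""
    rng = random.Random(seed)
    a = [[rng.uniform(-0.5, 0.5) for _ in range(n)] for _ in range(n)]
    for i in range(n):
        a[i][i] += n
    b = [rng.uniform(-0.5, 0.5) for _ in range(n)]
    return a, b


if __name__ == "__main__":
    a, b = random_system(8)
    x = solve_mixed(a, b)
    print("max |b - A x| =", max(abs(v) for v in residual(a, x, b)))
```

The expensive step (the factorization) runs in low precision, and only the comparatively cheap residual computation and update run in 64-bit arithmetic.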

NVIDIA’s HPCG benchmark accelerates the High Performance Conjugate Gradients (HPCG) Benchmark. HPCG is a software package that performs a fixed number of preconditioned conjugate gradient (PCG) iterations using double-precision (64-bit) floating-point values. The preconditioner is a multigrid method with a symmetric Gauss-Seidel smoother.
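The structure of a PCG iteration with a symmetric Gauss-Seidel step can be sketched in a few lines of pure Python. This is a simplified illustration, not the benchmark's code: it uses a small dense 1D Laplacian as a stand-in problem, applies a single-level symmetric Gauss-Seidel sweep as the preconditioner (no multigrid hierarchy), and adds an early-exit tolerance as a safety guard. All function names here (`laplacian_1d`, `symgs`, `pcg`) are hypothetical.

```python
def laplacian_1d(n):
    """Dense 1D Laplacian (2 on the diagonal, -1 next to it); it is SPD."""
    return [[2.0 if i == j else -1.0 if abs(i - j) == 1 else 0.0
             for j in range(n)] for i in range(n)]


def matvec(a, x):
    """Dense matrix-vector product."""
    return [sum(aij * xj for aij, xj in zip(row, x)) for row in a]


def dot(u, v):
    """Dot product of two vectors."""
    return sum(ui * vi for ui, vi in zip(u, v))


def symgs(a, r):
    """One symmetric Gauss-Seidel step on A z = r, starting from z = 0."""
    n = len(r)
    z = [0.0] * n
    # A forward sweep followed by a backward sweep keeps the operator symmetric.
    for i in list(range(n)) + list(reversed(range(n))):
        s = sum(a[i][j] * z[j] for j in range(n) if j != i)
        z[i] = (r[i] - s) / a[i][i]
    return z


def pcg(a, b, iters, tol=1e-12):
    """Run at most `iters` PCG iterations from x = 0.

    The `tol` early exit is a guard added for this sketch so that the loop
    never divides by a vanishing residual.
    """
    x = [0.0] * len(b)
    r = list(b)              # residual of the zero initial guess
    z = symgs(a, r)          # preconditioned residual
    p = list(z)              # first search direction
    rz = rz0 = dot(r, z)
    for _ in range(iters):
        if rz <= tol * tol * rz0:
            break
        ap = matvec(a, p)
        alpha = rz / dot(p, ap)
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, ap)]
        z = symgs(a, r)
        rz_new = dot(r, z)
        p = [zi + (rz_new / rz) * pi for zi, pi in zip(z, p)]
        rz = rz_new
    return x


if __name__ == "__main__":
    n = 16
    a = laplacian_1d(n)
    x_true = [float(i + 1) for i in range(n)]
    x = pcg(a, matvec(a, x_true), iters=50)
    print("max error =", max(abs(xi - ti) for xi, ti in zip(x, x_true)))
```

Each iteration costs one matrix-vector product, one preconditioner application, and a handful of dot products and vector updates, which is why workloads of this kind tend to be limited by memory bandwidth rather than by floating-point throughput.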

NVIDIA’s STREAM benchmark is a simple synthetic benchmark program that measures sustainable memory bandwidth.

The NVIDIA HPC Benchmarks package includes STREAM benchmarks optimized for NVIDIA GPUs and the NVIDIA Grace CPU.
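As a rough illustration of how a STREAM-style measurement is set up, the sketch below times the triad kernel `a[i] = b[i] + scalar * c[i]` and converts the best time into a bandwidth figure using the usual STREAM byte count (two reads and one write of 8-byte values per element). It is only a sketch: the function names are hypothetical, and a pure Python loop is dominated by interpreter overhead, so the printed number shows the bookkeeping and says nothing about the sustainable bandwidth of the hardware.

```python
import time
from array import array


def triad(a, b, c, scalar):
    """STREAM triad kernel: a[i] = b[i] + scalar * c[i]."""
    for i in range(len(a)):
        a[i] = b[i] + scalar * c[i]


def triad_bandwidth(n=100_000, repeats=3, scalar=3.0):
    """Return (GB/s based on the best of `repeats` runs, result array)."""
    a = array("d", [0.0]) * n
    b = array("d", [1.0]) * n
    c = array("d", [2.0]) * n
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        triad(a, b, c, scalar)
        best = min(best, time.perf_counter() - start)
    # Triad touches three arrays per element: read b, read c, write a.
    bytes_moved = 3 * n * a.itemsize
    return bytes_moved / best / 1e9, a


if __name__ == "__main__":
    gbps, _ = triad_bandwidth()
    print(f"triad: {gbps:.3f} GB/s (interpreter-bound, illustrative only)")
```

Taking the best of several repetitions follows the common STREAM practice of reporting the fastest run, which reduces the effect of warm-up and system noise.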

System Support#

This package is intended to support only:

  • CPU Architectures: x86_64 and NVIDIA Grace CPU (Arm SBSA).

  • GPU SM Architectures: NVIDIA Ampere GPU architecture (sm80), NVIDIA Hopper GPU architecture (sm90), and NVIDIA Blackwell GPU architecture (sm100).

  • Linux distributions with glibc >= 2.28; RHEL 8.8, SLES 15.5, Ubuntu 22.04, and Ubuntu 24.04 have been tested.

  • MPI libraries: OpenMPI, and libraries that are ABI-compatible with MPICH (e.g., MPICH, Cray MPICH, MVAPICH).

The x86_64 package includes only GPU benchmarks, while the Arm SBSA package offers both GPU and NVIDIA Grace CPU benchmarks.
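The support constraints in this section can be turned into a small pre-flight check. The sketch below is a hypothetical helper, not part of the package. It assumes that the Arm SBSA platform reports `aarch64` as its machine name on Linux (which by itself does not prove the CPU is NVIDIA Grace), and it takes the GPU SM number as an argument instead of querying the device.

```python
import platform

# Values taken from the support list above; the set names are illustrative.
SUPPORTED_MACHINES = {"x86_64", "aarch64"}  # aarch64 assumed for Arm SBSA
SUPPORTED_SM = {80, 90, 100}                # Ampere, Hopper, Blackwell
MIN_GLIBC = (2, 28)


def parse_version(text):
    """Turn '2.35' into (2, 35); non-numeric parts are ignored."""
    return tuple(int(part) for part in text.split(".") if part.isdigit())


def check_system(machine, libc_name, libc_version, sm):
    """Return a list of human-readable problems; an empty list means OK."""
    problems = []
    if machine not in SUPPORTED_MACHINES:
        problems.append(f"unsupported CPU architecture: {machine}")
    if libc_name != "glibc" or parse_version(libc_version) < MIN_GLIBC:
        found = f"{libc_name or 'unknown'} {libc_version}".strip()
        problems.append(f"need glibc >= 2.28, found {found}")
    if sm not in SUPPORTED_SM:
        problems.append(f"unsupported GPU SM architecture: sm{sm}")
    return problems


if __name__ == "__main__":
    libc_name, libc_version = platform.libc_ver()
    # The SM value is supplied by hand here; detecting it is out of scope.
    issues = check_system(platform.machine(), libc_name, libc_version, sm=90)
    print("\n".join(issues) if issues else "system looks supported")
```

Returning a list of problems instead of raising on the first failure lets a job script report every unmet requirement in one pass.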

Prerequisites#

  • CUDA 12.8 or newer

  • OpenMPI 4.1 or newer, or MPICH 3.4 or newer

NVIDIA HPC Benchmarks structure#

In addition to NVIDIA HPC benchmarks, the package contains:

  • NVIDIA NCCL 2.25.1

  • AWS OFI NCCL 1.6.0

  • NVIDIA NVSHMEM 3.2.5

  • NVIDIA GDR Copy 2.4

  • NVIDIA NVPL BLAS 25.1 (Arm SBSA only)

  • NVIDIA NVPL LAPACK 25.1 (Arm SBSA only)

  • NVIDIA NVPL Sparse 25.1 (Arm SBSA only)

  • LLVM OpenMP 18.1.1 (Arm SBSA only)

  • TCMalloc 4.5.3 (Arm SBSA only)


Support#

For questions or feedback, please contact HPCBenchmarks@nvidia.com.

Resources#