NVIDIA HPC Benchmarks#
The NVIDIA HPC-Benchmarks collection provides four benchmarks (HPL, HPL-MxP, HPCG, and STREAM) that are widely used in the HPC community and are optimized for performance on NVIDIA accelerated HPC systems.
NVIDIA’s HPL and HPL-MxP benchmarks are based on the Netlib HPL and HPL-MxP benchmarks. Both solve a (random) dense linear system on distributed-memory computers equipped with NVIDIA GPUs: HPL uses double precision (64-bit) arithmetic, while HPL-MxP uses mixed precision arithmetic on Tensor Cores.
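As a rough illustration of what an HPL-style run measures, the sketch below solves a small random dense system in double precision with Gaussian elimination and partial pivoting, then converts the elapsed time into a rate. It is a single-process, pure-Python toy and not NVIDIA's implementation. The function names (`lu_solve`, `hpl_flops`, `run`) are hypothetical, and the operation count of 2/3 n^3 + 3/2 n^2 is the figure conventionally credited to an n-by-n HPL solve, stated here as an assumption.

```python
import random
import time


def lu_solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting.

    Works on copies, so the caller's matrix and right-hand side are unchanged.
    """
    n = len(a)
    a = [row[:] for row in a]
    b = b[:]
    for k in range(n):
        # Partial pivoting: move the largest entry of column k onto the diagonal.
        p = max(range(k, n), key=lambda i: abs(a[i][k]))
        a[k], a[p] = a[p], a[k]
        b[k], b[p] = b[p], b[k]
        for i in range(k + 1, n):
            m = a[i][k] / a[k][k]
            for j in range(k, n):
                a[i][j] -= m * a[k][j]
            b[i] -= m * b[k]
    # Back substitution on the upper-triangular system.
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(a[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (b[i] - s) / a[i][i]
    return x


def hpl_flops(n):
    """Operation count assumed for an n-by-n solve: 2/3 n^3 + 3/2 n^2."""
    return (2.0 / 3.0) * n**3 + (3.0 / 2.0) * n**2


def max_residual(a, x, b):
    """Largest absolute entry of a x - b."""
    return max(
        abs(sum(aij * xj for aij, xj in zip(row, x)) - bi)
        for row, bi in zip(a, b)
    )


def run(n=60, seed=0):
    """Time one solve of a random system; return (residual, GFLOP/s)."""
    rng = random.Random(seed)
    a = [[rng.uniform(-0.5, 0.5) for _ in range(n)] for _ in range(n)]
    b = [rng.uniform(-0.5, 0.5) for _ in range(n)]
    start = time.perf_counter()
    x = lu_solve(a, b)
    elapsed = time.perf_counter() - start
    return max_residual(a, x, b), hpl_flops(n) / elapsed / 1e9


if __name__ == "__main__":
    residual, gflops = run()
    print(f"max residual = {residual:.3e}, rate = {gflops:.6f} GFLOP/s")
```

The rate printed by an interpreted loop like this says nothing about the hardware; the point is only the shape of the measurement: a fixed operation count divided by the wall-clock time of the solve, with a residual check confirming the answer.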
NVIDIA’s HPCG benchmark accelerates the High Performance Conjugate Gradients (HPCG) Benchmark. HPCG is a software package that performs a fixed number of preconditioned conjugate gradient (PCG) iterations in double precision (64-bit) floating point arithmetic, using a multigrid preconditioner with a symmetric Gauss-Seidel smoother.
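The iteration structure described above can be sketched in a few lines. The example below is a simplification under stated assumptions: it uses a single symmetric Gauss-Seidel sweep directly as the preconditioner (no multigrid hierarchy), a small dense 1D Laplacian instead of the benchmark's sparse 3D problem, and hypothetical function names (`sgs_apply`, `pcg`, `laplacian_1d`). It is meant only to show how a fixed number of PCG iterations fits together.

```python
import math


def matvec(a, x):
    """Dense matrix-vector product."""
    return [sum(aij * xj for aij, xj in zip(row, x)) for row in a]


def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))


def sgs_apply(a, r):
    """One symmetric Gauss-Seidel sweep on a z = r, starting from z = 0."""
    n = len(a)
    z = [0.0] * n
    for i in range(n):  # forward sweep
        s = sum(a[i][j] * z[j] for j in range(n) if j != i)
        z[i] = (r[i] - s) / a[i][i]
    for i in reversed(range(n)):  # backward sweep
        s = sum(a[i][j] * z[j] for j in range(n) if j != i)
        z[i] = (r[i] - s) / a[i][i]
    return z


def pcg(a, b, iterations):
    """Run at most `iterations` PCG steps from x = 0.

    Returns the iterate and the norm of the recurrence residual.
    """
    x = [0.0] * len(b)
    r = b[:]
    z = sgs_apply(a, r)
    p = z[:]
    rz = dot(r, z)
    for _ in range(iterations):
        ap = matvec(a, p)
        pap = dot(p, ap)
        if rz == 0.0 or pap == 0.0:
            break  # already converged exactly; avoid dividing by zero
        alpha = rz / pap
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, ap)]
        z = sgs_apply(a, r)
        rz_new = dot(r, z)
        beta = rz_new / rz
        p = [zi + beta * pi for zi, pi in zip(z, p)]
        rz = rz_new
    return x, math.sqrt(dot(r, r))


def laplacian_1d(n):
    """Tridiagonal (-1, 2, -1) matrix stored densely (illustrative test problem)."""
    return [
        [2.0 if i == j else (-1.0 if abs(i - j) == 1 else 0.0) for j in range(n)]
        for i in range(n)
    ]


if __name__ == "__main__":
    n = 8
    x, rnorm = pcg(laplacian_1d(n), [1.0] * n, iterations=n)
    print("residual norm:", rnorm)
```

Starting each sweep from zero makes the preconditioner symmetric, which conjugate gradients require; in the real benchmark the same smoother is applied at every level of a multigrid cycle rather than once on the fine grid.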
NVIDIA’s STREAM benchmark is a simple synthetic benchmark program that measures sustainable memory bandwidth.
The NVIDIA HPC Benchmarks package includes STREAM benchmarks optimized for NVIDIA GPUs and the NVIDIA Grace CPU.
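To make "sustainable memory bandwidth" concrete, here is a minimal sketch of a Triad-style kernel, `a[i] = b[i] + scalar * c[i]`, with the bandwidth derived from the bytes moved per pass. It assumes 8-byte double precision elements and counts two reads and one write per element; the helper names (`triad`, `triad_bytes`, `measure`) are hypothetical, and taking the best of several repeats is an assumed convention. In pure Python the interpreter overhead dominates, so the resulting number illustrates the arithmetic only, not the memory system.

```python
import time
from array import array


def triad(a, b, c, scalar):
    """Triad kernel: a[i] = b[i] + scalar * c[i]."""
    for i in range(len(a)):
        a[i] = b[i] + scalar * c[i]


def triad_bytes(n, word_size=8):
    """Bytes moved by one Triad pass: two reads and one write per element."""
    return 3 * word_size * n


def measure(n=100_000, repeats=5, scalar=3.0):
    """Run the kernel `repeats` times; return (a, best GB/s)."""
    a = array("d", [0.0]) * n
    b = array("d", [2.0]) * n
    c = array("d", [1.0]) * n
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        triad(a, b, c, scalar)
        best = min(best, time.perf_counter() - start)
    return a, triad_bytes(n) / best / 1e9


if __name__ == "__main__":
    result, gbps = measure()
    print(f"a[0] = {result[0]}, best rate = {gbps:.4f} GB/s")
```

The kernel does almost no arithmetic per byte, which is why benchmarks of this kind are limited by memory traffic rather than compute on real hardware.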
System Support#
This package is intended to support only:
CPU Architectures: x86_64 and NVIDIA Grace CPU (Arm SBSA).
GPU SM Architectures: NVIDIA Ampere GPU architecture (sm80), NVIDIA Hopper GPU architecture (sm90), and NVIDIA Blackwell GPU architecture (sm100).
Linux distributions with glibc >= 2.28. RHEL 8.8, SLES 15.5, Ubuntu 22.04, and Ubuntu 24.04 have been tested.
MPI libraries: OpenMPI, and libraries that are ABI-compatible with MPICH (e.g., MPICH, Cray MPICH, and MVAPICH).
The x86_64 package includes only GPU benchmarks, while the Arm SBSA package offers both GPU and NVIDIA Grace CPU benchmarks.
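A host can be checked against the CPU architecture and glibc requirements listed above before downloading the package. The sketch below uses only the Python standard library. It assumes that an Arm SBSA system reports its machine type as `aarch64`, and the helper names (`glibc_ok`, `machine_ok`, `check_host`) are hypothetical. It does not check the GPU architecture or the MPI library.

```python
import platform

# Assumption: Arm SBSA hosts (including NVIDIA Grace) report "aarch64".
SUPPORTED_MACHINES = {"x86_64", "aarch64"}
MIN_GLIBC = (2, 28)


def parse_version(text):
    """Turn a dotted version string such as "2.35" into a tuple of ints."""
    return tuple(int(part) for part in text.split(".") if part.isdigit())


def glibc_ok(version_text, minimum=MIN_GLIBC):
    """True if the glibc version string meets the minimum."""
    return parse_version(version_text) >= minimum


def machine_ok(machine):
    """True if the machine type is one of the supported CPU architectures."""
    return machine in SUPPORTED_MACHINES


def check_host():
    """Check the current host; returns a dict of pass/fail flags."""
    lib, version = platform.libc_ver()
    return {
        "machine": machine_ok(platform.machine()),
        "glibc": lib == "glibc" and glibc_ok(version),
    }


if __name__ == "__main__":
    for name, passed in check_host().items():
        print(f"{name}: {'ok' if passed else 'not supported'}")
```

Comparing versions as integer tuples avoids the string-comparison pitfall in which "2.4" would sort after "2.28".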
Prerequisites#
CUDA 12.8 or newer
OpenMPI 4.1 or newer, or MPICH 3.4 or newer
NVIDIA HPC Benchmarks structure#
In addition to NVIDIA HPC benchmarks, the package contains:
NVIDIA NCCL 2.25.1
AWS OFI NCCL 1.6.0
NVIDIA NVSHMEM 3.2.5
NVIDIA GDR Copy 2.4
NVIDIA NVPL BLAS 25.1 (Arm SBSA only)
NVIDIA NVPL LAPACK 25.1 (Arm SBSA only)
NVIDIA NVPL Sparse 25.1 (Arm SBSA only)
LLVM OpenMP 18.1.1 (Arm SBSA only)
TCMalloc 4.5.3 (Arm SBSA only)
Support#
For questions or to provide feedback, please contact HPCBenchmarks@nvidia.com