NVIDIA Performance Libraries#

The NVIDIA Performance Libraries (NVPL) are a collection of high-performance mathematical libraries optimized for NVIDIA aarch64 CPUs.

The NVPL libraries are CPU-only and have no dependencies on CUDA or the CUDA Toolkit (CTK). They are drop-in replacements for standard C and Fortran mathematical APIs, allowing HPC applications to achieve maximum performance on NVIDIA CPU platforms.

Libraries Documentation#

Installation#

Current release: NVPL-26.5
NVPL Downloads
See: Install NVPL

System Support#

Architecture: aarch64-linux
Platform: Arm SBSA
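
A quick pre-install check against the architecture listed above can be scripted. The sketch below is a minimal illustration using only the Python standard library. The helper name `is_nvpl_target` is hypothetical and not part of NVPL, and the check does not verify SBSA compliance.

```python
import platform


def is_nvpl_target(machine: str, system: str) -> bool:
    """Return True when the host reports aarch64 Linux (hypothetical helper)."""
    # platform.machine() reports "aarch64" on 64-bit Arm Linux;
    # "arm64" is accepted too because some environments use that spelling.
    return system == "Linux" and machine.lower() in ("aarch64", "arm64")


if __name__ == "__main__":
    machine, system = platform.machine(), platform.system()
    verdict = "matches" if is_nvpl_target(machine, system) else "does not match"
    print(f"{system}/{machine} {verdict} the aarch64-linux requirement")
```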

CPU Support#

NVIDIA Vera (Armv9.2-A Olympus)
NVIDIA Grace (Armv9.0-A Neoverse-V2)
NVIDIA DGX Spark / GB10 (Armv9.2-A Cortex-X925/A725)
AWS Graviton 5 (Armv9.2-A Neoverse-V3)
AWS Graviton 4 (Armv9.0-A Neoverse-V2)
AWS Graviton 3/3e (Armv8.4-A Neoverse-V1)
AWS Graviton 2 (Armv8.2-A Neoverse-N1)
Ampere Altra (Armv8.2-A Neoverse-N1)
Any CPU with Armv8.1-A or later architecture

OS Support#

The following OS versions are tested for all combinations of compiler, OpenMP, and MPI support. In general, any Linux OS for aarch64 should also be supported. If no system package is provided for your OS, use the tarball distribution.

Amazon Linux: 2 (EOL), 2023
Debian: 12, 13
Fedora: 42, 43
RHEL: RHEL8 (8.10), RHEL9 (9.7), RHEL10 (10.1)
openSUSE/Leap: 15.6, 16.0
SLES: SLES15 (15.6), SLES16 (16.0)
Ubuntu: 20.04, 22.04, 24.04, 26.04
Minimum glibc supported: 2.26 for NVPL 26.5. The next NVPL release will support systems with glibc 2.28 or newer. See Release Notes.
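
The glibc requirement above can be checked before installing. The following is a minimal sketch using the Python standard library. The helper names are hypothetical, and the two thresholds (2.26 and 2.28) are taken from the text above.

```python
import platform


def parse_version(text: str) -> tuple:
    """Turn a dotted version such as "2.35" into a comparable tuple (2, 35)."""
    return tuple(int(part) for part in text.split(".") if part.isdigit())


def glibc_meets_minimum(found: str, minimum: str = "2.26") -> bool:
    """Compare a detected glibc version against a minimum version."""
    return parse_version(found) >= parse_version(minimum)


if __name__ == "__main__":
    # platform.libc_ver() returns ("glibc", "<version>") on glibc systems
    # and empty strings when it cannot determine the C library.
    lib, version = platform.libc_ver()
    if lib == "glibc" and version:
        for minimum in ("2.26", "2.28"):
            status = "ok" if glibc_meets_minimum(version, minimum) else "too old"
            print(f"glibc {version} vs minimum {minimum}: {status}")
    else:
        print("Could not detect glibc; check with your distribution's tools.")
```

Versions are compared as integer tuples rather than strings, so that 2.9 sorts below 2.26.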

Compiler Support#

GCC: 8 – 16+
Clang: 14 – 22+
Clang for NVIDIA Grace: 16.x – 21.x
NVIDIA HPC Compilers: 23.9 – 26.3+

Language Support#

C: All libraries
C++: All libraries via C interfaces
Fortran: Selected libraries
- GFortran ABI
- NVPL BLAS, LAPACK, and ScaLAPACK provide lp64 and ilp64 integer ABIs
- See individual libraries documentation for further details
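
The lp64 and ilp64 ABIs differ in the width of integer arguments. By the usual BLAS/LAPACK convention, lp64 interfaces take 32-bit integers and ilp64 interfaces take 64-bit integers. The sketch below assumes that convention and uses a hypothetical helper, `fits_lp64`, to test whether a set of integer arguments (dimensions, leading dimensions, increments) fits the lp64 interface. Consult the individual library documentation for the authoritative definition.

```python
INT32_MAX = 2**31 - 1  # largest value a signed 32-bit integer can hold


def fits_lp64(*int_args: int) -> bool:
    """Return True if every integer argument fits in a signed 32-bit integer."""
    return all(-INT32_MAX - 1 <= value <= INT32_MAX for value in int_args)


if __name__ == "__main__":
    # A length of 3 billion does not fit in 32 bits, so it needs ilp64.
    for n in (1_000_000, 3_000_000_000):
        abi = "lp64" if fits_lp64(n) else "ilp64"
        print(f"n = {n}: use the {abi} interface")
```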

Python#

Python packages with binary redistribution of NVPL libraries are available on both PyPI and conda-forge.

NVPL Python Usage - Python wheel packages and installation
Using NVPL with Conda - Conda packages and environment setup
Building NumPy and SciPy with NVPL - Building custom NumPy and SciPy wheels with NVPL support

OpenMP Support#

All libraries support the following OpenMP runtime libraries. See each library’s documentation for details and API extensions supporting nested parallelism.

GCC: libgomp.so
Clang: libomp.so
NVHPC: libnvomp.so

Warning

NVPL libraries do not explicitly link against any particular OpenMP runtime; they rely on the OpenMP library that the application and environment load at runtime. An application linked to NVPL should load the same OpenMP runtime at execution time that it was built with. Mixing OpenMP runtimes between build time and run time may cause incorrect behavior or degraded performance.

Warning

NVIDIA HPC modules provide a libgomp.so symlink to libnvomp.so. This symlink will be on LD_LIBRARY_PATH if NVHPC environment modules are loaded. Use ldd to ensure that applications built with GCC do not accidentally load the libgomp.so symlink from the HPC SDK because of LD_LIBRARY_PATH precedence. Use libnvomp.so only if the application was built with NVHPC compilers.
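
The ldd check described above can be automated. The following is a minimal sketch that parses `ldd` output and reports which OpenMP runtime each entry resolves to, including the real file behind any symlink. The function name and the script name `check_openmp.py` are hypothetical, and the parser assumes the common `<soname> => <path> (<address>)` line layout.

```python
import os
import subprocess
import sys

# OpenMP runtime file names listed in this section.
OPENMP_RUNTIMES = ("libgomp.so", "libomp.so", "libnvomp.so")


def find_openmp_runtimes(ldd_output: str) -> dict:
    """Map each OpenMP runtime named in ldd output to the path it resolves to."""
    resolved = {}
    for line in ldd_output.splitlines():
        if "=>" not in line:
            continue  # entries without a resolved path are skipped
        soname, _, target = line.partition("=>")
        soname = soname.strip()
        if not soname.startswith(OPENMP_RUNTIMES):
            continue
        target = target.strip()
        if target.startswith("not found"):
            resolved[soname] = "not found"
        else:
            fields = target.split()
            resolved[soname] = fields[0] if fields else ""
    return resolved


# Run only when a binary path is given: python check_openmp.py <binary>
if __name__ == "__main__" and len(sys.argv) == 2:
    out = subprocess.run(
        ["ldd", sys.argv[1]], capture_output=True, text=True, check=True
    ).stdout
    for soname, path in find_openmp_runtimes(out).items():
        # realpath follows symlinks, which exposes a libgomp.so that
        # actually points at libnvomp.so.
        print(f"{soname} -> {path} (real file: {os.path.realpath(path)})")
```

If a GCC-built application reports a libgomp.so entry whose real file is libnvomp.so, the HPC SDK symlink is taking precedence on LD_LIBRARY_PATH.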

MPI Support#

NVPL provides standard BLACS interfaces for the following MPI distributions. See the NVPL ScaLAPACK Documentation for details.

MPICH: mpich-4.0 or later runtime supported.
OpenMPI-3.x
OpenMPI-4.x
OpenMPI-5.x
NVIDIA HPC-X via OpenMPI-4 interface