cuSOLVERMp: A High-Performance CUDA Library for Distributed Dense Linear Algebra

NVIDIA cuSOLVERMp is a high-performance, distributed-memory, GPU-accelerated library that provides tools for the solution of dense linear systems and eigenvalue problems.

cuSOLVERMp is compatible with the 2D block-cyclic data layout and provides ScaLAPACK-like C APIs.
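As an illustration of the 2D block-cyclic layout mentioned above, the following is a minimal sketch of the standard index arithmetic along one grid dimension (apply it once for rows and once for columns). It does not call any cuSOLVERMp function: the helper names `owner_of`, `local_index_of`, and `local_count` are hypothetical, indices are 0-based, and the formulas follow the usual ScaLAPACK convention (the last helper mirrors what ScaLAPACK's `NUMROC` computes).

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: coordinate of the process (along one grid dimension)
 * that owns global index g, given block size nb, nprocs processes in that
 * dimension, and src = the process holding the first block. */
int64_t owner_of(int64_t g, int64_t nb, int64_t src, int64_t nprocs)
{
    assert(g >= 0 && nb > 0 && nprocs > 0 && src >= 0 && src < nprocs);
    return (src + g / nb) % nprocs;
}

/* Hypothetical helper: local index of global index g on its owning process. */
int64_t local_index_of(int64_t g, int64_t nb, int64_t nprocs)
{
    assert(g >= 0 && nb > 0 && nprocs > 0);
    return (g / (nb * nprocs)) * nb + g % nb;
}

/* Hypothetical helper: how many of the n global rows (or columns) are stored
 * on process p. Same arithmetic as ScaLAPACK's NUMROC. */
int64_t local_count(int64_t n, int64_t nb, int64_t p, int64_t src, int64_t nprocs)
{
    assert(n >= 0 && nb > 0 && nprocs > 0 && p >= 0 && p < nprocs);
    int64_t dist    = (nprocs + p - src) % nprocs; /* distance from src process  */
    int64_t nblocks = n / nb;                      /* number of full blocks      */
    int64_t count   = (nblocks / nprocs) * nb;     /* full rounds over all procs */
    int64_t extra   = nblocks % nprocs;            /* leftover full blocks       */

    if (dist < extra) {
        count += nb;        /* this process gets one more full block */
    } else if (dist == extra) {
        count += n % nb;    /* this process gets the trailing partial block */
    }
    return count;
}
```

For example, with `n = 10`, `nb = 2`, `src = 0`, and two processes in the dimension, process 0 holds global indices 0, 1, 4, 5, 8, 9 (6 entries) and process 1 holds 2, 3, 6, 7 (4 entries), so global index 5 maps to local index 3 on process 0.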

Download: The cuSOLVERMp library is available through the NVIDIA Developer Zone, the NVIDIA HPC SDK, PyPI (CUDA 12, CUDA 13), and conda.

Key Features

  • Multi-process, multi-GPU.

  • One process per GPU.

  • ScaLAPACK-like C functions and interfaces to facilitate porting.

  • NCCL communication backend.

  • Logging and tracing.

  • Tensor-core accelerated.

Index