cusolverMp: A High-Performance CUDA Library for Distributed Dense Linear Algebra
NVIDIA cusolverMp is a high-performance, distributed-memory, GPU-accelerated library that provides tools for the solution of dense linear systems and eigenvalue problems.
cusolverMp supports the 2D block-cyclic data layout and provides ScaLAPACK-like C APIs.
A companion library, CAL, contains utilities to manage communicators and to synchronize processes in a safe way.
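To make the 2D block-cyclic layout concrete, here is a minimal sketch in plain C of the index arithmetic along one dimension of the process grid (the same arithmetic applies independently to rows and to columns). It assumes 0-based indices and the usual ScaLAPACK convention of dealing fixed-size blocks round-robin over the processes. The function names `local_count`, `owner_of`, and `local_index` are hypothetical helpers for illustration only and are not part of the cusolverMp API.

```c
#include <assert.h>
#include <stdint.h>

/* Number of elements of a global dimension of size n stored on process
 * coordinate iproc, when blocks of size nb are dealt round-robin over
 * nprocs processes, starting at process isrc.
 * Follows the semantics of ScaLAPACK's NUMROC. */
int64_t local_count(int64_t n, int64_t nb, int iproc, int isrc, int nprocs)
{
    assert(n >= 0 && nb > 0 && nprocs > 0);
    assert(iproc >= 0 && iproc < nprocs && isrc >= 0 && isrc < nprocs);

    int dist = (nprocs + iproc - isrc) % nprocs; /* distance from the source process */
    int64_t nblocks = n / nb;                    /* number of full blocks */
    int64_t count = (nblocks / nprocs) * nb;     /* full rounds shared by every process */
    int64_t extra = nblocks % nprocs;            /* leftover full blocks */

    if (dist < extra)
        count += nb;                             /* one more full block */
    else if (dist == extra)
        count += n % nb;                         /* the trailing partial block, if any */
    return count;
}

/* Process coordinate that owns global index g (0-based). */
int owner_of(int64_t g, int64_t nb, int isrc, int nprocs)
{
    assert(g >= 0 && nb > 0 && nprocs > 0);
    return (int)((g / nb + isrc) % nprocs);
}

/* Position of global index g (0-based) inside its owner's local storage. */
int64_t local_index(int64_t g, int64_t nb, int nprocs)
{
    assert(g >= 0 && nb > 0 && nprocs > 0);
    return (g / (nb * nprocs)) * nb + g % nb;
}
```

For example, with 10 rows, a block size of 2, and 2 process rows, process row 0 holds global rows 0, 1, 4, 5, 8, 9 (6 rows) and process row 1 holds global rows 2, 3, 6, 7 (4 rows), so global row 5 lives on process row 0 at local position 3.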
Download: the cusolverMp library is available through the NVIDIA HPC SDK.
- Multi-process, multi-GPU.
- One process per GPU.
- ScaLAPACK-like C functionalities and interfaces to facilitate porting.
- Configurable communication backends (NCCL, MPI, UCX).
- Logging and tracing.
- Tensor-core accelerated.
- Supported SM Architectures:
- Supported OSes:
- Supported CPU Architectures:
- Supported MPI Libraries:
OpenMPI (shipped with HPC-SDK),
- Release Notes
- Getting Started
- cusolverMp Data Types
- cusolverMp Logging
- cusolverMp C API
- Communication abstraction library API and data types
- Software License Agreement