cusolverMp: A High-Performance CUDA Library for Distributed Dense Linear Algebra
NVIDIA cusolverMp is a high-performance, distributed-memory, GPU-accelerated library that provides tools for the solution of dense linear systems and eigenvalue problems.
cusolverMp is compatible with 2D block-cyclic data layout and provides ScaLAPACK-like C APIs.
A companion library, CAL, contains utilities to manage communicators and to synchronize processes in a safe way.
Download: The cusolverMp library is available through the NVIDIA HPC SDK.
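The 2D block-cyclic layout mentioned above applies the same one-dimensional rule independently to rows and to columns. The following is a minimal sketch of that rule in plain C, not part of the cusolverMp API: the function names `owner_of`, `local_index_of`, and `local_count` are hypothetical helpers, and indices are 0-based for simplicity (ScaLAPACK's Fortran interfaces are 1-based).

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helpers illustrating a 1D block-cyclic distribution.
 * Apply them once with row parameters and once with column parameters
 * to obtain the 2D layout. All indices are 0-based.
 *   g      : global row (or column) index
 *   nb     : block size along this dimension
 *   src    : process coordinate that holds the first block
 *   nprocs : number of processes along this dimension */

/* Process coordinate that owns global index g. */
int64_t owner_of(int64_t g, int64_t nb, int64_t src, int64_t nprocs)
{
    assert(g >= 0 && nb > 0 && nprocs > 0 && src >= 0 && src < nprocs);
    return (src + g / nb) % nprocs;
}

/* Position of global index g inside the local storage of its owner. */
int64_t local_index_of(int64_t g, int64_t nb, int64_t nprocs)
{
    assert(g >= 0 && nb > 0 && nprocs > 0);
    /* Full rounds of nprocs blocks completed before g, plus offset in block. */
    return (g / (nb * nprocs)) * nb + g % nb;
}

/* Number of rows (or columns) of an n-long dimension stored on process p.
 * This mirrors the logic of ScaLAPACK's NUMROC. */
int64_t local_count(int64_t n, int64_t nb, int64_t p, int64_t src, int64_t nprocs)
{
    assert(n >= 0 && nb > 0 && nprocs > 0 && p >= 0 && p < nprocs);
    assert(src >= 0 && src < nprocs);
    int64_t dist = (nprocs + p - src) % nprocs; /* distance from source process */
    int64_t nblocks = n / nb;                   /* number of full blocks */
    int64_t count = (nblocks / nprocs) * nb;    /* blocks every process receives */
    int64_t extra = nblocks % nprocs;           /* leftover full blocks */
    if (dist < extra) {
        count += nb;                            /* one more full block */
    } else if (dist == extra) {
        count += n % nb;                        /* the trailing partial block */
    }
    return count;
}
```

For example, with 10 rows, a block size of 2, and 2 process rows starting at process row 0, global row 5 is owned by process row 0 at local row 3, and process rows 0 and 1 store 6 and 4 rows respectively.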
Key Features
- Multi-process, multi-GPU.
- One process per GPU.
- ScaLAPACK-like C functionalities and interfaces to facilitate porting.
- Configurable communication backends (NCCL, MPI, UCX).
- Logging and tracing.
- Tensor-core accelerated.
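As an illustration of the multi-process, one-process-per-GPU model listed above, the sketch below shows one way a launcher might place each process in a 2D process grid and choose its device. It is an assumption-laden example rather than cusolverMp code: `grid_coord`, `rank_to_coord_col_major`, and `device_for_local_rank` are hypothetical names, and the column-major rank ordering is only one possible convention.

```c
#include <assert.h>

/* Hypothetical helper type: position of a process in the 2D process grid. */
typedef struct {
    int row;
    int col;
} grid_coord;

/* Map a rank to grid coordinates, filling the grid column by column
 * (column-major ordering is an assumption made for this sketch). */
grid_coord rank_to_coord_col_major(int rank, int nprow, int npcol)
{
    assert(nprow > 0 && npcol > 0);
    assert(rank >= 0 && rank < nprow * npcol);
    grid_coord c;
    c.row = rank % nprow;
    c.col = rank / nprow;
    return c;
}

/* Choose the GPU for a process from its node-local rank.
 * Returns -1 when the node has more processes than visible GPUs,
 * because that would break the one-process-per-GPU rule. */
int device_for_local_rank(int local_rank, int device_count)
{
    assert(local_rank >= 0 && device_count >= 0);
    if (local_rank >= device_count) {
        return -1;
    }
    return local_rank;
}
```

In a real program the returned device index would typically be passed to `cudaSetDevice` before any library handle is created, and the node-local rank would come from the MPI launcher or a node-local communicator.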
Support
- Supported SM Architectures: SM 8.0, SM 8.6
- Supported OSes: Linux
- Supported CPU Architectures: x86_64, ppc64le
- Supported MPI Libraries: OpenMPI (shipped with HPC-SDK), SpectrumMPI 11.x
Prerequisites
- HPC-SDK.
- Dependencies: cudart and nvrtc; the cublas.h, cusolverDn.h, and cal.h headers; the libcal.so and cusolverMp.so binaries.