cusolverMp: A High-Performance CUDA Library for Distributed Dense Linear Algebra
NVIDIA cusolverMp is a high-performance, distributed-memory, GPU-accelerated library that provides tools for the solution of dense linear systems and eigenvalue problems.
cusolverMp operates on matrices stored in the 2D block-cyclic data layout and provides ScaLAPACK-like C APIs.
A companion library, CAL, contains utilities to manage communicators and to synchronize processes in a safe way.
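As a rough illustration of the 2D block-cyclic layout mentioned above, the sketch below shows how one dimension of a global matrix is split across a row (or column) of the process grid. The helper names `local_extent`, `owner_of`, and `local_index` are hypothetical and are not part of the cusolverMp API; `local_extent` follows the arithmetic of ScaLAPACK's `NUMROC` routine. The same formulas are applied independently to rows and columns to obtain the 2D layout.

```c
#include <stdint.h>

/* Number of rows (or columns) of a global dimension of size n that land on
 * process iproc, when blocks of size nb are dealt round-robin over nprocs
 * processes starting at process isrc. Mirrors ScaLAPACK's NUMROC arithmetic.
 * Hypothetical helper, not a cusolverMp function. */
int64_t local_extent(int64_t n, int64_t nb, int iproc, int isrc, int nprocs)
{
    int dist = (nprocs + iproc - isrc) % nprocs; /* distance from the source process */
    int64_t nblocks = n / nb;                    /* number of complete blocks */
    int64_t extent = (nblocks / nprocs) * nb;    /* full rounds every process receives */
    int64_t extra = nblocks % nprocs;            /* complete blocks left over */

    if (dist < extra) {
        extent += nb;       /* this process receives one more complete block */
    } else if (dist == extra) {
        extent += n % nb;   /* this process receives the trailing partial block */
    }
    return extent;
}

/* Process coordinate that owns global index g (0-based). */
int owner_of(int64_t g, int64_t nb, int isrc, int nprocs)
{
    return (int)((g / nb + isrc) % nprocs);
}

/* Position of global index g (0-based) inside its owner's local storage. */
int64_t local_index(int64_t g, int64_t nb, int nprocs)
{
    return (g / (nb * nprocs)) * nb + g % nb;
}
```

For example, with `n = 10`, `nb = 3`, and two processes, the blocks `[0..2]` and `[6..8]` go to process 0 and the blocks `[3..5]` and `[9]` go to process 1, so the local extents are 6 and 4.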
Download: the cusolverMp library is available through the NVIDIA HPC SDK.
Key Features
- Multi-process, multi-GPU.
- One process per GPU.
- ScaLAPACK-like C functionality and interfaces to facilitate porting.
- Configurable communication backends (NCCL, MPI, UCX).
- Logging and tracing.
- Tensor-core accelerated.
Support
- Supported SM Architectures: SM 8.0, SM 8.6
- Supported OSes: Linux
- Supported CPU Architectures: x86_64, ppc64le
- Supported MPI Libraries: OpenMPI (shipped with HPC SDK), SpectrumMPI 11.x
Prerequisites
- HPC SDK.
- Dependencies: cudart, nvrtc; the cublas.h, cusolverDn.h, and cal.h headers; the libcal.so and cusolverMp.so binaries.