NVIDIA cuDSS (Preview): A high-performance CUDA Library for Direct Sparse Solvers
NVIDIA cuDSS (Preview) is a library of GPU-accelerated linear solvers for sparse matrices. It provides algorithms for solving linear systems of the following form:
\[AX = B\]
where \(A\) is a sparse matrix, \(B\) is the right-hand side, and \(X\) is the unknown solution (either a matrix or a vector).
The cuDSS functionality allows flexibility in matrix properties and solver configuration, as well as execution parameters like CUDA streams.
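To make the system \(AX = B\) concrete, the following minimal host-side sketch stores a sparse \(A\) in compressed sparse row (CSR) form and measures how well a candidate \(X\) with several right-hand sides satisfies the system. It does not call cuDSS: the `CsrMatrix` type and the `csr_residual_max` function are hypothetical names introduced here for illustration, and the column-major layout of \(X\) and \(B\) is an assumption of this sketch.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical CSR container for a square sparse matrix with int indices. */
typedef struct {
    int n;                 /* number of rows and columns */
    const int *row_ptr;    /* n + 1 offsets into col_idx and values */
    const int *col_idx;    /* column index of each stored entry */
    const double *values;  /* value of each stored entry */
} CsrMatrix;

/* Returns the largest |(A X - B)_ij| over all entries.
   X and B hold nrhs columns of length n each, stored column by column
   (an assumption of this sketch). */
double csr_residual_max(const CsrMatrix *A, const double *X,
                        const double *B, int nrhs)
{
    assert(A != NULL && X != NULL && B != NULL && nrhs > 0);
    double worst = 0.0;
    for (int j = 0; j < nrhs; ++j) {
        const double *x = X + (size_t)j * (size_t)A->n;
        const double *b = B + (size_t)j * (size_t)A->n;
        for (int i = 0; i < A->n; ++i) {
            /* Row i of A times column j of X, touching stored entries only. */
            double sum = 0.0;
            for (int k = A->row_ptr[i]; k < A->row_ptr[i + 1]; ++k)
                sum += A->values[k] * x[A->col_idx[k]];
            double r = sum - b[i];
            if (r < 0.0) r = -r;
            if (r > worst) worst = r;
        }
    }
    return worst;
}
```

A residual near zero indicates that the candidate \(X\) solves the system for every right-hand side. A check of this kind is a common way to validate the output of any direct solver.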
Note: Since the library is released as a preview, the API is subject to change in later releases.
Download: developer.nvidia.com/cudss-downloads
Provide Feedback: cuDSS-EXTERNAL-Group@nvidia.com
Examples: cuDSS Example 1, cuDSS Example 2, cuDSS Example 3
Key Features and Properties
Real/complex general/symmetric/positive-definite sparse matrices
Single and double precision datatype for values and int datatype for indices
Single and multiple right-hand sides
Multi-stage execution with three main phases: reordering & symbolic factorization, numerical factorization and solving
Different algorithms for reordering and factorization phases
Refactorization
Iterative refinement
User-defined device memory handlers and memory pools
Hybrid host/device memory mode/algorithm
Multi-GPU multi-node (MGMN) execution
Synchronous API
Non-deterministic computations
Support
Supported configurations: single GPU, multi-GPU multi-node (MGMN)
Supported SM Architectures: all SM starting with Pascal (SM_87 only for aarch64/Jetson build)
Supported OSes: Linux, Windows
Supported CPU Architectures: x86_64, ARM (SBSA), ARM (aarch64/Jetson) for Orin devices