NVIDIA cuSolverDx Documentation
The cuSolver Device Extensions (cuSolverDx) library enables some matrix factorization, linear solver and eigenvalue solver routines from cuSolver and cuSparse to be executed inside CUDA kernels. Fusing these operations with other computations can decrease latency and improve overall performance of your application.
cuSolverDx is part of the MathDx package, which also includes cuBLASDx for basic linear algebra subroutines (BLAS), cuFFTDx for FFT calculations, cuRANDDx for random number generation, and nvCOMPDx for data compression and decompression. All the device extension libraries are designed to work together. When using multiple device extension libraries in a single project, they should all come from the same MathDx release.
Highlights
The cuSolverDx library currently provides:
High performance batched Cholesky, LU, and QR factorizations, tridiagonal/triangular/full linear system solve, least squares solve, eigenvalue solver, and singular value decomposition functions that can be embedded into a CUDA kernel.
Customizable options to compose these functions for different use cases, such as size, precision, type, fill mode, storage layout, and targeted CUDA architecture.
Ability to fuse cuSolverDx functions with other operations in order to reduce round trips to global memory.
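To make the idea of fusion concrete, here is a minimal host-side C++ sketch. It does not use the cuSolverDx API; the helpers `cholesky_lower`, `cholesky_solve`, and `fused_shift_solve` are hypothetical names introduced only for illustration. It assumes a small symmetric positive definite system in row-major storage, and uses local `std::vector` buffers as a stand-in for the shared memory a CUDA kernel would use. Each problem is read once, modified (a diagonal shift plays the role of the "other operation"), factorized, and solved before the result is written back once.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical helper (not a cuSolverDx API): in-place Cholesky A = L * L^T of
// an n x n row-major matrix. Only the lower triangle is read and written.
// Returns false if a non-positive pivot shows the matrix is not SPD.
bool cholesky_lower(double* a, std::size_t n) {
    for (std::size_t j = 0; j < n; ++j) {
        double d = a[j * n + j];
        for (std::size_t k = 0; k < j; ++k) d -= a[j * n + k] * a[j * n + k];
        if (d <= 0.0) return false;
        const double ljj = std::sqrt(d);
        a[j * n + j] = ljj;
        for (std::size_t i = j + 1; i < n; ++i) {
            double s = a[i * n + j];
            for (std::size_t k = 0; k < j; ++k) s -= a[i * n + k] * a[j * n + k];
            a[i * n + j] = s / ljj;
        }
    }
    return true;
}

// Hypothetical helper: given the factor L from cholesky_lower, solve
// L * L^T * x = b, overwriting b with x.
void cholesky_solve(const double* l, std::size_t n, double* b) {
    // Forward substitution: L * y = b.
    for (std::size_t i = 0; i < n; ++i) {
        double s = b[i];
        for (std::size_t k = 0; k < i; ++k) s -= l[i * n + k] * b[k];
        b[i] = s / l[i * n + i];
    }
    // Backward substitution: L^T * x = y.
    for (std::size_t i = n; i-- > 0;) {
        double s = b[i];
        for (std::size_t k = i + 1; k < n; ++k) s -= l[k * n + i] * b[k];
        b[i] = s / l[i * n + i];
    }
}

// Fused pipeline for one problem: load once, apply the extra operation
// (A + shift * I), factorize, solve, store once. The local vectors stand in
// for on-chip memory; in an unfused design each step would read and write
// "global" buffers separately.
bool fused_shift_solve(const double* a_global, const double* b_global,
                       double shift, std::size_t n, double* x_global) {
    std::vector<double> a(a_global, a_global + n * n);  // single load of A
    std::vector<double> b(b_global, b_global + n);      // single load of b
    for (std::size_t i = 0; i < n; ++i) a[i * n + i] += shift;
    if (!cholesky_lower(a.data(), n)) return false;
    cholesky_solve(a.data(), n, b.data());
    for (std::size_t i = 0; i < n; ++i) x_global[i] = b[i];  // single store
    return true;
}
```

In a real kernel, the factorization and solve steps would be the library-provided device functions, and the shift would be whatever pre- or post-processing the application needs. The structure is the point: intermediate results never leave fast memory.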