cuSPARSELt: A High-Performance CUDA Library for Sparse Matrix-Matrix Multiplication
NVIDIA cuSPARSELt is a high-performance CUDA library dedicated to general matrix-matrix operations in which at least one operand is a sparse matrix:
$$D = \alpha \, op(A) \cdot op(B) + \beta \, op(C)$$
where op(A)/op(B) refers to in-place operations such as transpose/non-transpose.
The cuSPARSELt APIs allow flexibility in the algorithm/operation selection, epilogue, and matrix characteristics, including memory layout, alignment, and data types.
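To make the shape of the operation concrete, the following is a minimal CPU reference sketch of a GEMM-like update of the form D = alpha * op(A) * op(B) + beta * C. It is not cuSPARSELt code and does not call the library. The names `op_at` and `reference_matmul` are hypothetical, all matrices are assumed to be dense and row-major, and C is used without a transpose for simplicity. A routine like this is mainly useful for checking GPU results on small inputs.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper (not part of cuSPARSELt): element (i, j) of op(M),
// where M is stored row-major with leading dimension ld.
inline float op_at(const std::vector<float>& m, std::size_t ld,
                   bool transpose, std::size_t i, std::size_t j) {
    return transpose ? m[j * ld + i] : m[i * ld + j];
}

// Computes D = alpha * op(A) * op(B) + beta * C on the CPU.
// op(A) is m x k, op(B) is k x n, C and D are m x n (all dense, row-major).
std::vector<float> reference_matmul(const std::vector<float>& a, bool trans_a,
                                    const std::vector<float>& b, bool trans_b,
                                    const std::vector<float>& c,
                                    std::size_t m, std::size_t n, std::size_t k,
                                    float alpha, float beta) {
    assert(a.size() == m * k && b.size() == k * n && c.size() == m * n);
    // A transposed operand is stored with its dimensions swapped.
    const std::size_t lda = trans_a ? m : k;
    const std::size_t ldb = trans_b ? k : n;
    std::vector<float> d(m * n);
    for (std::size_t i = 0; i < m; ++i) {
        for (std::size_t j = 0; j < n; ++j) {
            float acc = 0.0f;
            for (std::size_t p = 0; p < k; ++p) {
                acc += op_at(a, lda, trans_a, i, p) * op_at(b, ldb, trans_b, p, j);
            }
            d[i * n + j] = alpha * acc + beta * c[i * n + j];
        }
    }
    return d;
}
```

For example, calling `reference_matmul(a, true, b, false, c, m, n, k, 1.0f, 0.0f)` computes the plain product of the transpose of A with B. A column-major variant would only need the two index expressions in `op_at` swapped.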
NVIDIA Ampere architecture Sparse MMA tensor core support
FP16 inputs/output, FP32 Tensor Core accumulate
BFLOAT16 inputs/output, FP32 Tensor Core accumulate
INT8 inputs/output, INT32 Tensor Core compute
Memory Layouts: row-major, column-major
Matrix pruning and compression functionalities
Auto-tuning functionality (see cusparseLtMatmulSearch())
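As background for the pruning and compression features listed above, here is a small host-side sketch of the 2:4 structured sparsity pattern used by NVIDIA Ampere sparse Tensor Cores, in which at most two of every four consecutive elements are nonzero. It is an illustration only: the magnitude-based selection, the `Compressed24` container, and the functions `top2`, `prune_2_4`, and `compress_2_4` are all hypothetical, and the library's own pruning criteria and compressed storage layout may differ.

```cpp
#include <array>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

// Hypothetical container: two kept values per group of four, plus their positions.
struct Compressed24 {
    std::vector<float> values;          // 2 values per group
    std::vector<std::uint8_t> indices;  // position (0..3) of each kept value
};

// Positions (ascending) of the two largest-magnitude entries in m[g .. g+3].
std::array<std::size_t, 2> top2(const std::vector<float>& m, std::size_t g) {
    std::size_t first = 0, second = 1;
    if (std::fabs(m[g + 1]) > std::fabs(m[g])) std::swap(first, second);
    for (std::size_t i = 2; i < 4; ++i) {
        const float v = std::fabs(m[g + i]);
        if (v > std::fabs(m[g + first])) {
            second = first;
            first = i;
        } else if (v > std::fabs(m[g + second])) {
            second = i;
        }
    }
    if (first > second) std::swap(first, second);
    return {first, second};
}

// Zero the two smallest-magnitude entries in every group of four elements.
void prune_2_4(std::vector<float>& m) {
    assert(m.size() % 4 == 0);
    for (std::size_t g = 0; g < m.size(); g += 4) {
        const std::array<std::size_t, 2> keep = top2(m, g);
        for (std::size_t i = 0; i < 4; ++i) {
            if (i != keep[0] && i != keep[1]) m[g + i] = 0.0f;
        }
    }
}

// Store only the kept values and their in-group positions (half the elements).
Compressed24 compress_2_4(const std::vector<float>& m) {
    assert(m.size() % 4 == 0);
    Compressed24 out;
    for (std::size_t g = 0; g < m.size(); g += 4) {
        for (std::size_t pos : top2(m, g)) {
            out.values.push_back(m[g + pos]);
            out.indices.push_back(static_cast<std::uint8_t>(pos));
        }
    }
    return out;
}
```

Pruning fixes the sparsity pattern, and compression then halves the stored values at the cost of a small index per kept element. In practice the library's own pruning and compression routines should be used, since the hardware expects a specific layout that this sketch does not reproduce.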
Supported SM Architectures:
Supported CPU Architectures:
Support for other platforms will be added in future releases.