#####################################################################################
cuSPARSELt: A High-Performance CUDA Library for Sparse Matrix-Matrix Multiplication
#####################################################################################

**NVIDIA cuSPARSELt** is a high-performance CUDA library dedicated to general matrix-matrix operations in which at least one operand is a sparse matrix:

.. math::

   D = Activation(\alpha op(A) \cdot op(B) + \beta op(C) + bias) \cdot scale

where :math:`op(A)/op(B)` refers to in-place operations such as transpose/non-transpose, and :math:`\alpha, \beta, scale` are scalars.

The *cuSPARSELt APIs* allow flexibility in the algorithm/operation selection, epilogue, and matrix characteristics, including memory layout, alignment, and data types.

**Download:** `developer.nvidia.com/cusparselt/downloads `_

**Provide Feedback:** `Math-Libs-Feedback@nvidia.com `_

**Examples**: `cuSPARSELt Example 1 `_, `cuSPARSELt Example 2 `_

**Blog post**: `Exploiting NVIDIA Ampere Structured Sparsity with cuSPARSELt `_

================================================================================
Key Features
================================================================================

* *NVIDIA Sparse MMA tensor core* support
* Mixed-precision computation support:

  * `FP16` input/output, `FP32` Tensor Core accumulate
  * `BFLOAT16` input/output, `FP32` Tensor Core accumulate
  * `INT8` input/output, `INT32` Tensor Core compute
  * `INT8` input, `FP16` output, `INT32` Tensor Core compute
  * `FP32` input/output, `TF32` Tensor Core compute
  * `TF32` input/output, `TF32` Tensor Core compute

* Matrix pruning and compression functionalities
* Activation functions, bias vector, and output scaling
* Batched computation (multiple matrices in a single run)
* GEMM Split-K mode
* Auto-tuning functionality (see :ref:`cusparseLtMatmulSearch() `)
* NVTX ranging and Logging functionalities
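To make the operation defined at the top of this page and the pruning feature more concrete, the following is a minimal CPU reference sketch in plain C. It does not call cuSPARSELt and is not the library's implementation. The helper names ``prune_2_4`` and ``matmul_epilogue_ref`` are hypothetical. The sketch assumes row-major storage, no transposes, ReLU as the activation, a bias vector with one entry per row of :math:`D`, and a 2:4 structured-sparsity pattern (at most two non-zero values in every group of four consecutive elements in a row of :math:`A`).

```c
#include <assert.h>
#include <stddef.h>

/* Absolute value without depending on libm. */
static float abs_f32(float x) { return x < 0.0f ? -x : x; }

/* Hypothetical helper: in every group of 4 consecutive elements of each row,
 * keep the 2 values with the largest magnitude and zero the other 2.
 * Assumes a row-major matrix whose column count is a multiple of 4. */
static void prune_2_4(float *a, size_t rows, size_t cols)
{
    assert(cols % 4 == 0);
    for (size_t r = 0; r < rows; ++r) {
        for (size_t g = 0; g < cols; g += 4) {
            float *v = &a[r * cols + g];

            /* Index of the largest magnitude in the group. */
            size_t first = 0;
            for (size_t i = 1; i < 4; ++i)
                if (abs_f32(v[i]) > abs_f32(v[first])) first = i;

            /* Index of the largest magnitude among the remaining three. */
            size_t second = (first == 0) ? 1 : 0;
            for (size_t i = 0; i < 4; ++i)
                if (i != first && abs_f32(v[i]) > abs_f32(v[second])) second = i;

            /* Zero everything except the two kept values. */
            for (size_t i = 0; i < 4; ++i)
                if (i != first && i != second) v[i] = 0.0f;
        }
    }
}

/* Hypothetical reference for
 *     D = ReLU(alpha * A * B + beta * C + bias) * scale
 * A is m x k, B is k x n, C and D are m x n (all row-major).
 * bias has m entries and is broadcast along each row of D (an assumption). */
static void matmul_epilogue_ref(size_t m, size_t n, size_t k,
                                float alpha, const float *a, const float *b,
                                float beta, const float *c,
                                const float *bias, float scale, float *d)
{
    for (size_t i = 0; i < m; ++i) {
        for (size_t j = 0; j < n; ++j) {
            float acc = 0.0f;
            for (size_t p = 0; p < k; ++p)
                acc += a[i * k + p] * b[p * n + j];

            float x = alpha * acc + beta * c[i * n + j] + bias[i];
            if (x < 0.0f) x = 0.0f;   /* ReLU activation */
            d[i * n + j] = x * scale; /* output scaling  */
        }
    }
}
```

A reference like this is mainly useful for checking GPU results on small inputs: prune :math:`A` on the host, run the reference, and compare against the library output within a tolerance appropriate for the chosen data types.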
================================================================================
Support
================================================================================

* *Supported SM Architectures*: `SM 8.0`, `SM 8.6`, `SM 8.9`
* *Supported OSes*: `Linux`, `Windows`
* *Supported CPU Architectures*: `x86_64`, `Arm64`

================================================================================
Index
================================================================================

.. toctree::
   :maxdepth: 2

   release_notes
   getting_started
   types
   functions
   logging
   license