Release Notes
cuSPARSELt v0.0.1
Initial release
Support Linux x86_64 and SM 8.0
Provide the following mixed-precision computation kernels:
FP16 inputs/output, FP32 Tensor Core accumulate
BFLOAT16 inputs/output, FP32 Tensor Core accumulate
INT8 inputs/output, INT32 Tensor Core compute
Compatibility notes:
cuSPARSELt requires CUDA 11.0 or above
cuSPARSELt v0.1.0
Added support for Windows x86-64 and Linux Arm64 platforms
Introduced SM 8.6 compatibility
Added new kernels:
FP32 inputs/output, TF32 Tensor Core compute
TF32 inputs/output, TF32 Tensor Core compute
Better performance for SM 8.0 kernels (up to 90% SOL)
New APIs for compression and pruning decoupled from cusparseLtMatmulPlan_t
Compatibility notes:
cuSPARSELt requires CUDA 11.2 or above
cusparseLtMatDescriptor_t must be destroyed with the cusparseLtMatDescriptorDestroy function
Both static and shared libraries must be linked with the nvrtc library
On Linux systems, both static and shared libraries must be linked with the dl library
Resolved issues:
CUSPARSELT_MATMUL_SEARCH_ITERATIONS is now handled correctly
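
The minimum CUDA versions listed in the compatibility notes (11.0 for v0.0.1, 11.2 for v0.1.0) can be checked by an application before it uses the library. The following is a minimal sketch, not part of cuSPARSELt. It assumes the version integer comes from the CUDA runtime (for example `cudaRuntimeGetVersion`) or the `CUDA_VERSION` macro, both of which encode a version as 1000 * major + 10 * minor. The helper names `cuda_version_major`, `cuda_version_minor`, and `cuda_version_at_least` are hypothetical and chosen only for illustration.

```c
/* CUDA encodes versions as 1000 * major + 10 * minor,
 * e.g. 11000 for CUDA 11.0 and 11020 for CUDA 11.2. */

/* Hypothetical helper: extract the major part of an encoded CUDA version. */
static int cuda_version_major(int version)
{
    return version / 1000;
}

/* Hypothetical helper: extract the minor part of an encoded CUDA version. */
static int cuda_version_minor(int version)
{
    return (version % 1000) / 10;
}

/* Hypothetical helper: return 1 if the encoded version is at least
 * major.minor, and 0 otherwise. */
static int cuda_version_at_least(int version, int major, int minor)
{
    return version >= major * 1000 + minor * 10;
}
```

With a version obtained from the runtime, `cuda_version_at_least(version, 11, 2)` would gate use of cuSPARSELt v0.1.0, and `cuda_version_at_least(version, 11, 0)` would do the same for v0.0.1.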