Release Notes#
This section includes significant changes, new features, performance improvements, and known issues. Unless noted otherwise, listed issues should not impact functionality. When functionality is impacted, we provide a workaround to avoid the issue (if available).
0.1.1#
Patch release that adds support for CUDA 13.
New Features#
Support for CUDA 13.0.
Support for the Thor SM renaming from sm_101 to sm_110 starting with CUDA 13.0; for CUDA 12.9 and older releases, Thor stays labeled as sm_101.
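The renaming rule above can be sketched as a small lookup. This is only an illustration for build scripts or tooling that must pick the right Thor label for a given toolkit: `thor_sm_label` is a hypothetical helper and not part of nvCOMPDx, and it assumes the CUDA Toolkit version is already available as two integers.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper (not an nvCOMPDx API): returns the SM label that Thor
// carries under a given CUDA Toolkit version, following the note above.
std::string thor_sm_label(int cuda_major, int cuda_minor) {
    assert(cuda_major > 0 && cuda_minor >= 0);
    (void)cuda_minor;  // the rename takes effect at 13.0, so only the major version matters
    return (cuda_major >= 13) ? "sm_110" : "sm_101";
}
```

With this sketch, `thor_sm_label(12, 9)` yields `sm_101` and `thor_sm_label(13, 0)` yields `sm_110`.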
0.1.0#
The first early access (EA) release of the nvCOMPDx library.
New Features#
Added warp- and block-level APIs for compression and decompression with the LZ4 and ANS algorithms.
Added LZ4 compressor support for uint8, uint16, and uint32 data types (compatible with nvCOMP).
Added ANS compressor support for uint8 and (b)float16 data types (compatible with nvCOMP).
Added ANS strong-scaling support for both compression and decompression.
Added interoperability with nvCOMP: chunks compressed by nvCOMPDx can be decompressed by nvCOMP, and vice versa.
Added flexible thread block size support for both compression and decompression; users can use arbitrary block sizes when embedding nvCOMPDx.
Introduced global scratch, shared scratch, input, and output alignment requirements for both compression and decompression.
Added support for SM70 (Volta) through SM120 (Blackwell) CUDA architectures:
Support for SM72 has been deprecated.
Support for SM87, SM103, and SM121 is experimental.
Added support for NVRTC + nvJitLink use cases for runtime kernel compilation and linking.
Added complete examples demonstrating the usage of the library.
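The architecture support levels listed for this release can be summarized as a small classifier. This is a sketch for illustration only: `SmSupport` and `classify_sm` are hypothetical names, not part of nvCOMPDx. It assumes the SM version is encoded as an integer such as 90 for SM90, and it does not check that a given number corresponds to a real GPU architecture.

```cpp
#include <cassert>

// Hypothetical support levels, mirroring the wording of the 0.1.0 notes.
enum class SmSupport { Supported, Deprecated, Experimental, Unsupported };

// Hypothetical helper (not an nvCOMPDx API): maps an SM number (e.g. 90 for
// SM90) to the support level stated in the 0.1.0 release notes.
SmSupport classify_sm(int sm) {
    assert(sm > 0);
    if (sm == 72) return SmSupport::Deprecated;
    if (sm == 87 || sm == 103 || sm == 121) return SmSupport::Experimental;
    if (sm >= 70 && sm <= 120) return SmSupport::Supported;  // SM70 through SM120
    return SmSupport::Unsupported;
}
```

The special cases are checked before the range test because SM103 falls inside the SM70 to SM120 range, while SM121 lies just outside it.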