Release Notes

This section lists significant changes, new features, performance improvements, and known issues. Unless noted otherwise, listed issues should not impact functionality. When an issue does impact functionality, we provide a workaround if one is available.

0.1.0

The first early access (EA) release of the nvCOMPDx library.

New Features

Added warp- and block-level APIs for compression and decompression with the LZ4 and ANS algorithms.
Added LZ4 compressor support for uint8, uint16, and uint32 data types (compatible with nvCOMP).
Added ANS compressor support for uint8, float16, and bfloat16 data types (compatible with nvCOMP).
Added ANS strong-scaling support for both compression and decompression.
Added interoperability with nvCOMP: chunks compressed by nvCOMPDx can be decompressed by nvCOMP, and vice versa.
Added flexible thread block size support for both compression and decompression; users can use arbitrary block sizes when embedding nvCOMPDx.
Introduced global scratch, shared scratch, input, and output alignment requirements for both compression and decompression.
Added support for SM70 (Volta) through SM120 (Blackwell) CUDA architectures:
- Support for SM72 has been deprecated.
- Support for SM87, SM103, and SM121 is experimental.
Added support for NVRTC + nvJitLink use cases for runtime kernel compilation and linking.
Added complete examples demonstrating the usage of the library.
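
The architecture support tiers listed above can be summarized in a small lookup. The following is a minimal C++ sketch and not part of nvCOMPDx: `SmSupport` and `classify_sm` are hypothetical names, the SM value is assumed to be encoded as major * 10 + minor (for example, compute capability 8.6 becomes 86), and every value between 70 and 120 is treated as supported, which is a simplification because the exact set of supported architectures in that range is not listed here.

```cpp
// Hypothetical helper, NOT an nvCOMPDx API: maps an SM version to the
// support tier described in the 0.1.0 release notes.
enum class SmSupport { Unsupported, Supported, Deprecated, Experimental };

// `sm` is assumed to be encoded as major * 10 + minor (e.g. 8.6 -> 86).
constexpr SmSupport classify_sm(int sm) {
    // SM72 is listed as deprecated.
    if (sm == 72) return SmSupport::Deprecated;
    // SM87, SM103, and SM121 are listed as experimental.
    if (sm == 87 || sm == 103 || sm == 121) return SmSupport::Experimental;
    // Assumption: anything else in the SM70..SM120 range counts as supported.
    if (sm >= 70 && sm <= 120) return SmSupport::Supported;
    return SmSupport::Unsupported;
}

// Human-readable label for logging or diagnostics.
constexpr const char* to_string(SmSupport s) {
    switch (s) {
        case SmSupport::Supported:    return "supported";
        case SmSupport::Deprecated:   return "deprecated";
        case SmSupport::Experimental: return "experimental";
        default:                      return "unsupported";
    }
}

// Compile-time checks against the tiers stated in the release notes.
static_assert(classify_sm(70) == SmSupport::Supported, "SM70 is in range");
static_assert(classify_sm(72) == SmSupport::Deprecated, "SM72 is deprecated");
static_assert(classify_sm(121) == SmSupport::Experimental, "SM121 is experimental");
```

A build script or host-side check could call `classify_sm` with the target architecture and warn when the result is `Deprecated` or `Experimental`.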