Release Notes

This section lists significant changes, new features, performance improvements, and known issues. Unless noted otherwise, the listed issues should not affect functionality. When functionality is affected, we provide a workaround if one is available.

0.1.0

The first early access (EA) release of the nvCOMPDx library.

New Features

  • Added warp- and block-level APIs for compression and decompression with the LZ4 and ANS algorithms.

  • Added LZ4 compressor support for uint8, uint16, and uint32 data types (compatible with nvCOMP).

  • Added ANS compressor support for uint8, float16, and bfloat16 data types (compatible with nvCOMP).

  • Added ANS strong-scaling support for both compression and decompression.

  • Added interoperability with nvCOMP: chunks compressed by nvCOMPDx can be decompressed by nvCOMP, and vice versa.

  • Added flexible thread block size support for both compression and decompression; users can use arbitrary block sizes when embedding nvCOMPDx.

  • Introduced global scratch, shared scratch, input, and output alignment requirements for both compression and decompression.

  • Added support for SM70 (Volta) through SM120 (Blackwell) CUDA architectures:
    • Support for SM72 has been deprecated.

    • Support for SM87, SM103, and SM121 is experimental.

  • Added support for NVRTC + nvJitLink use cases for runtime kernel compilation and linking.

  • Added complete examples demonstrating the usage of the library.
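
Because this release introduces alignment requirements for the global scratch, shared scratch, input, and output buffers, it can help to validate buffers before passing them to a compression or decompression call. The following is a minimal host-side sketch, not part of nvCOMPDx: the helper names `is_aligned` and `align_up` are hypothetical, both assume a power-of-two alignment, and the actual alignment values must be taken from the library's own API rather than from this example.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical helper (not an nvCOMPDx API): returns true if `ptr` is
// aligned to `alignment` bytes. `alignment` must be a power of two.
inline bool is_aligned(const void* ptr, std::size_t alignment) {
    return (reinterpret_cast<std::uintptr_t>(ptr) & (alignment - 1)) == 0;
}

// Hypothetical helper: rounds `size` up to the next multiple of
// `alignment`. `alignment` must be a power of two.
constexpr std::size_t align_up(std::size_t size, std::size_t alignment) {
    return (size + alignment - 1) & ~(alignment - 1);
}
```

A caller could use `align_up` to pad a scratch buffer size before allocating it, and `is_aligned` as a debug-time check on the pointers it hands to the library.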