Python API

nvCOMP Python API reference

This is the Python API reference for the NVIDIA® nvCOMP library.

BitstreamKind

class nvidia.nvcomp.BitstreamKind

Defines how a buffer is compressed by nvCOMP.

Members:

NVCOMP_NATIVE : Each input buffer is chunked according to the manager setting, and the chunks are compressed in parallel. Allows computation of checksums. Adds a custom header with nvCOMP metadata at the beginning of the compressed data.

RAW : Compresses the input data as is, using only the underlying compression algorithm. Does not add a header with nvCOMP metadata.

WITH_UNCOMPRESSED_SIZE : Similar to RAW, but adds a custom header containing only the uncompressed size at the beginning of the compressed data.

Codec

class nvidia.nvcomp.Codec
__init__(self: nvidia.nvcomp.nvcomp_impl.Codec, **kwargs) -> None

Initialize codec.

Parameters:
  • algorithm – An optional name of the compression algorithm to use. By default it is empty, and the algorithm can be deduced during decoding.

  • device_id – An optional device ID to execute decoding/encoding on. If not specified, the default device is used.

  • cuda_stream – An optional cudaStream_t represented as a Python integer. By default, an internal CUDA stream is created for the given device ID.

  • uncomp_chunk_size – An optional uncompressed data chunk size. By default it is 65536.

  • checksum_policy

    Defines the strategy for computing and verifying checksums. By default NO_COMPUTE_NO_VERIFY is assumed.

    LZ4 algorithm specific options:

    data_type: An optional array-protocol type string for the default data type to use.

    GDeflate algorithm specific options:
    algorithm_type: Compression algorithm type to use. Permitted values are:
    • 0 : highest-throughput, entropy-only compression (use for symmetric compression/decompression performance)

    • 1 : high-throughput, low compression ratio (default)

    • 2 : medium-throughput, medium compression ratio; beats zlib level 1 on compression ratio

    • 3 : placeholder for future compression levels; currently falls back to MEDIUM_COMPRESSION

    • 4 : lower-throughput, higher compression ratio; beats zlib level 6 on compression ratio

    • 5 : lowest-throughput, highest compression ratio

    Deflate algorithm specific options:
    algorithm_type: Compression algorithm type to use. Permitted values are:
    • 0 : highest-throughput, entropy-only compression (use for symmetric compression/decompression performance)

    • 1 : high-throughput, low compression ratio (default)

    • 2 : medium-throughput, medium compression ratio; beats zlib level 1 on compression ratio

    • 3 : placeholder for future compression levels; currently falls back to MEDIUM_COMPRESSION

    • 4 : lower-throughput, higher compression ratio; beats zlib level 6 on compression ratio

    • 5 : lowest-throughput, highest compression ratio

    Bitcomp algorithm specific options:
    algorithm_type: The type of Bitcomp algorithm used.
    • 0 : Default algorithm, usually gives the best compression ratios

    • 1 : “Sparse” algorithm, works well on sparse data (with lots of zeroes) and is usually faster than the default algorithm.

    data_type: An optional array-protocol type string for the default data type to use.

    ANS algorithm specific options:
    data_type: An optional array-protocol type string for the default data type to use. Permitted values are:
    • |u1 : 8-bit unsigned integer

    • <f2 : 16-bit float. Requires uncomp_chunk_size to be a multiple of 2

    Cascaded algorithm specific options:

    data_type: An optional array-protocol type string for the default data type to use.

    num_rles: The number of Run Length Encodings to perform. By default it is 2.

    num_deltas: The number of Delta Encodings to perform. By default it is 1.

    use_bitpack: Whether or not to bitpack the final layers. By default it is True.
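As an illustration of the constructor options above, the following sketch builds a Codec from keyword arguments and estimates how many chunks a buffer is split into for a given uncomp_chunk_size. The helper names chunk_count and make_codec are hypothetical. The algorithm name "LZ4" is an assumption about the strings the library accepts, and the ceiling-division chunk count is an assumption about how chunking behaves, not a documented guarantee. nvidia.nvcomp is imported lazily so the arithmetic helper can be used on a machine without a GPU.

```python
DEFAULT_UNCOMP_CHUNK_SIZE = 65536  # default documented for uncomp_chunk_size


def chunk_count(num_bytes: int,
                uncomp_chunk_size: int = DEFAULT_UNCOMP_CHUNK_SIZE) -> int:
    """Estimate the number of chunks for a buffer (assumes ceiling division)."""
    if num_bytes < 0:
        raise ValueError("num_bytes must be non-negative")
    if uncomp_chunk_size <= 0:
        raise ValueError("uncomp_chunk_size must be positive")
    return (num_bytes + uncomp_chunk_size - 1) // uncomp_chunk_size


def make_codec(algorithm: str = "LZ4", **options):
    """Create a Codec; needs the nvidia.nvcomp package and a CUDA device."""
    from nvidia import nvcomp  # lazy import: only required when called

    # Extra keyword arguments (device_id, uncomp_chunk_size, ...) are passed
    # through unchanged to the Codec constructor.
    return nvcomp.Codec(algorithm=algorithm, **options)
```

A call such as make_codec(uncomp_chunk_size=1 << 17) would then request 128 KiB chunks, assuming the selected algorithm supports that size.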

decode(*args, **kwargs)

Overloaded function.

  1. decode(self: nvidia.nvcomp.nvcomp_impl.Codec, src: nvidia.nvcomp.nvcomp_impl.Array, data_type: str = '') -> object

    Executes decoding of data from an Array handle.

    Args:

    src: Decode source object.

    data_type: An optional array-protocol type string for the output data type. By default it is equal to |u1.

    Returns:

    nvcomp.Array

  2. decode(self: nvidia.nvcomp.nvcomp_impl.Codec, srcs: list[nvidia.nvcomp.nvcomp_impl.Array], data_type: str = '') -> list[object]

    Executes decoding from a batch of Array handles.

    Args:

    srcs: List of Array objects

    data_type: An optional array-protocol type string for output data type.

    Returns:

    List of decoded nvcomp.Array objects
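The data_type arguments take array-protocol type strings, the same notation used by the NumPy array interface: a byte-order character (<, > or |), a one-letter kind code, and the item size in bytes. The sketch below splits such a string into its parts, so |u1 reads as a 1-byte unsigned integer and <f2 as a little-endian 2-byte float. The function name parse_typestr is hypothetical and not part of nvCOMP.

```python
import re

# byte order, kind letter, item size in bytes
_TYPESTR = re.compile(r"^([<>|])([a-zA-Z])(\d+)$")


def parse_typestr(typestr: str) -> tuple[str, str, int]:
    """Split an array-protocol type string into (byte order, kind, item size)."""
    match = _TYPESTR.match(typestr)
    if match is None:
        raise ValueError(f"not an array-protocol type string: {typestr!r}")
    byte_order, kind, size = match.groups()
    return byte_order, kind, int(size)
```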

encode(*args, **kwargs)

Overloaded function.

  1. encode(self: nvidia.nvcomp.nvcomp_impl.Codec, array_s: nvidia.nvcomp.nvcomp_impl.Array) -> object

    Encodes a single Array.

    Args:

    array_s: Array to encode

    Returns:

    Encoded nvcomp.Array

  2. encode(self: nvidia.nvcomp.nvcomp_impl.Codec, srcs: list[nvidia.nvcomp.nvcomp_impl.Array]) -> list[object]

    Executes encoding from a batch of Array handles.

    Args:

    srcs: List of Array objects

    Returns:

    List of encoded nvcomp.Array objects
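The single-Array overloads of encode and decode can be combined into a round trip. The following is a sketch only and has not been verified against a specific nvCOMP release. It assumes NumPy is installed, that "LZ4" is an accepted algorithm name, that arrays should be moved to the device with cuda() before encoding, and that a host-side Array can be read back through np.from_dlpack. The helper names roundtrip and compression_ratio are hypothetical.

```python
def compression_ratio(uncompressed_size: int, compressed_size: int) -> float:
    """Uncompressed bytes divided by compressed bytes (higher is better)."""
    if compressed_size <= 0:
        raise ValueError("compressed_size must be positive")
    return uncompressed_size / compressed_size


def roundtrip(payload: bytes, algorithm: str = "LZ4") -> tuple[bytes, float]:
    """Encode and decode payload on the GPU; returns (restored bytes, ratio)."""
    import numpy as np         # assumption: NumPy is available
    from nvidia import nvcomp  # requires a CUDA device

    # Writable host copy, wrapped via its __array_interface__.
    host = nvcomp.as_array(np.frombuffer(payload, dtype=np.uint8).copy())
    device = host.cuda()                   # host -> device copy
    codec = nvcomp.Codec(algorithm=algorithm)
    encoded = codec.encode(device)         # single-Array overload
    decoded = codec.decode(encoded)        # output type defaults to |u1
    # Assumption: the host Array exports DLPack in a form NumPy accepts.
    restored = np.from_dlpack(decoded.cpu()).tobytes()
    return restored, compression_ratio(device.buffer_size, encoded.buffer_size)
```

The batch overloads follow the same pattern, taking and returning lists of Array objects.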

ArrayBufferKind

class nvidia.nvcomp.ArrayBufferKind

Defines the kind of buffer in which array data is stored.

Members:

STRIDED_DEVICE : GPU-accessible in pitch-linear layout.

STRIDED_HOST : Host-accessible in pitch-linear layout.

Array

class nvidia.nvcomp.Array

Wraps an array, which can hold either decoded data or data to encode.

property __cuda_array_interface__

The CUDA array interchange interface compatible with Numba v0.39.0 or later (see CUDA Array Interface for details).

__dlpack__(self: nvidia.nvcomp.nvcomp_impl.Array, stream: object = None) -> capsule

Export the array as a DLPack tensor.

__dlpack_device__(self: nvidia.nvcomp.nvcomp_impl.Array) -> tuple

Get the device associated with the buffer.
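In the DLPack protocol, __dlpack_device__ returns a (device_type, device_id) tuple, where device_type is an integer code from the DLPack header (1 for CPU, 2 for CUDA, 3 for pinned CUDA host memory). The sketch below turns such a tuple into a readable label. The names DLDeviceType and describe_device are illustrative and not part of nvCOMP, and only a subset of the DLPack codes is listed.

```python
from enum import IntEnum


class DLDeviceType(IntEnum):
    """Subset of the device type codes defined by DLPack."""
    CPU = 1
    CUDA = 2
    CUDA_HOST = 3  # pinned host memory


def describe_device(device: tuple[int, int]) -> str:
    """Render a (device_type, device_id) tuple as, e.g., 'CUDA:0'."""
    device_type, device_id = device
    try:
        name = DLDeviceType(int(device_type)).name
    except ValueError:
        # Code not covered by the subset above.
        name = f"UNKNOWN({int(device_type)})"
    return f"{name}:{device_id}"
```

With an Array named arr, describe_device(arr.__dlpack_device__()) would report where its buffer lives.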

property buffer_kind

Buffer kind in which array data is stored.

property buffer_size

The total number of bytes needed to store the array.

cpu(self: nvidia.nvcomp.nvcomp_impl.Array) -> object

Returns a copy of this array in CPU memory. If this array is already in CPU memory, then no copy is performed and the original object is returned.

Returns:

Array object with content in CPU memory or None if copy could not be done.

cuda(self: nvidia.nvcomp.nvcomp_impl.Array, synchronize: bool = True, cuda_stream: int = 0) -> object

Returns a copy of this array in device memory. If this array is already in device memory, then no copy is performed and the original object is returned.

Parameters:
  • synchronize – If True (the default), blocks until the copy from host to device has finished. Otherwise no synchronization is performed, and the caller must synchronize using the CUDA stream provided by, for example, __cuda_array_interface__.

  • cuda_stream – An optional cudaStream_t, represented as a Python integer, on which the host buffer is copied to the device.

Returns:

Array object with content in device memory or None if copy could not be done.

property dtype
property item_size

Size of each element in bytes.

property ndim
property precision

Maximum number of significant bits in the data type. A value of 0 means that the precision is equal to the data type bit depth.

property shape
property size

Number of elements this array holds.

property strides

Strides of axes in bytes.
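To make the relationship between shape, item_size, strides and buffer_size concrete, the sketch below computes byte strides and the total byte count for a densely packed, row-major array. This is an assumption for illustration: a pitch-linear buffer may include row padding, in which case the actual strides and buffer_size reported by an Array can be larger. Both function names are hypothetical.

```python
from math import prod


def contiguous_strides(shape: tuple[int, ...], item_size: int) -> tuple[int, ...]:
    """Byte strides of a densely packed row-major array."""
    strides = []
    step = item_size  # innermost axis advances by one element
    for extent in reversed(shape):
        strides.append(step)
        step *= extent  # each outer axis spans all inner elements
    return tuple(reversed(strides))


def dense_buffer_size(shape: tuple[int, ...], item_size: int) -> int:
    """Total bytes for a dense array: element count times item size."""
    return prod(shape) * item_size
```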

to_dlpack(self: nvidia.nvcomp.nvcomp_impl.Array, cuda_stream: object = None) -> capsule

Export the array with zero-copy conversion to a DLPack tensor.

Parameters:

cuda_stream – An optional cudaStream_t represented as a Python integer, upon which synchronization must take place in the created Array.

Returns:

DLPack tensor which is encapsulated in a PyCapsule object.

as_array

nvidia.nvcomp.as_array(source: object, cuda_stream: int = 0) -> nvidia.nvcomp.nvcomp_impl.Array

Wraps an external buffer as an array and ties the buffer lifetime to the array.

Parameters:
  • source – Input DLPack tensor encapsulated in a PyCapsule object, or another object that provides __cuda_array_interface__, __array_interface__, or the __dlpack__ and __dlpack_device__ methods.

  • cuda_stream – An optional cudaStream_t represented as a Python integer, upon which synchronization must take place in the created Array.

Returns:

nvcomp.Array
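As a sketch of what an object with __array_interface__ looks like, the class below exposes a bytearray as a one-dimensional |u1 array using version 3 of the NumPy array interface. HostBytes is a hypothetical name, and whether as_array accepts this exact object has not been verified here. In practice a NumPy array already provides the same attribute.

```python
import ctypes


class HostBytes:
    """Exposes a bytearray through __array_interface__ as a 1-D |u1 array."""

    def __init__(self, data: bytes):
        self._buffer = bytearray(data)  # owned, writable copy of the input
        # ctypes view over the same memory, used to obtain its address
        self._view = (ctypes.c_ubyte * len(self._buffer)).from_buffer(self._buffer)

    @property
    def __array_interface__(self) -> dict:
        return {
            "version": 3,
            "shape": (len(self._buffer),),
            "typestr": "|u1",
            # (pointer to first byte, read-only flag)
            "data": (ctypes.addressof(self._view), False),
        }
```

Under these assumptions, nvcomp.as_array(HostBytes(b"...")) would wrap the buffer without copying, and the object must stay alive for as long as the resulting Array is used.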

as_arrays

nvidia.nvcomp.as_arrays(sources: list[object], cuda_stream: int = 0) -> list[object]

Wraps all external buffers as arrays and ties each buffer's lifetime to its array.

Parameters:
  • sources – List of input DLPack tensors encapsulated in PyCapsule objects, or other objects that provide __cuda_array_interface__, __array_interface__, or the __dlpack__ and __dlpack_device__ methods.

  • cuda_stream – An optional cudaStream_t represented as a Python integer, upon which synchronization must take place in the created Array.

Returns:

List of nvcomp.Array objects

from_dlpack

nvidia.nvcomp.from_dlpack(source: object, cuda_stream: int = 0) -> nvidia.nvcomp.nvcomp_impl.Array

Zero-copy conversion from a DLPack tensor to an array.

Parameters:
  • source – Input DLPack tensor encapsulated in a PyCapsule object, or another (array) object with __dlpack__ and __dlpack_device__ methods.

  • cuda_stream – An optional cudaStream_t represented as a Python integer, upon which synchronization must take place in the created Array.

Returns:

nvcomp.Array