Release Notes

cuDensityMat v0.2.0

Support vector-Jacobian product (VJP) computation (backward differentiation) for non-batched, single-GPU execution with dense operators only (dense elementary operators and dense matrix operators).
Support new backward-differentiation gradient callbacks for VJP computation.
Support single-mode multidiagonal elementary operators with arbitrary mode extent and up to 256 non-zero diagonals.
Support Volta, Turing, Ampere, Ada, Hopper, and Blackwell NVIDIA GPU architectures (compute capability 7.0+).
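
For readers unfamiliar with the term, the following is a minimal sketch of what a vector-Jacobian product computes, using a real-valued linear map y = A x in plain Python. It does not call cuDensityMat. The helper names (matvec, vjp_linear) are hypothetical, and complex-conjugation conventions are deliberately left out.

```python
def matvec(matrix, vector):
    """Forward pass: y = matrix @ vector for a row-major list-of-lists matrix."""
    return [sum(a * x for a, x in zip(row, vector)) for row in matrix]


def vjp_linear(matrix, cotangent):
    """Backward pass for y = matrix @ x with respect to x.

    The Jacobian of y with respect to x is the matrix itself, so the
    vector-Jacobian product is cotangent^T @ matrix (real-valued case only).
    """
    n_rows, n_cols = len(matrix), len(matrix[0])
    return [
        sum(cotangent[i] * matrix[i][j] for i in range(n_rows))
        for j in range(n_cols)
    ]


A = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
x = [2.0, 1.0]
v = [1.0, 0.0, -1.0]  # cotangent: gradient of a scalar loss with respect to y

y = matvec(A, x)
grad_x = vjp_linear(A, v)

# Defining identity of the VJP for a linear map: <v, A x> == <vjp(v), x>
lhs = sum(vi * yi for vi, yi in zip(v, y))
rhs = sum(gi * xi for gi, xi in zip(grad_x, x))
```

The backward pass never builds the Jacobian as a separate object; it only applies it to the incoming cotangent. That is what makes the VJP the natural primitive for backward differentiation.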

Compatibility notes:

Bugs fixed:

Fixed an illegal memory access caused by multidiagonal elementary operators of mode dimension 2.

Support full matrix operators (operator matrices defined in the full composite space)
Support batched operators (both elementary tensor operators and full matrix operators)
Support both CPU-side and GPU-side tensor/scalar callbacks
Support Volta, Turing, Ampere, Ada, Hopper, and Blackwell NVIDIA GPU architectures (compute capability 7.0+)

Compatibility notes:

cuDensityMat requires CUDA 12 or above
cuDensityMat requires cuTENSOR 2.2.0 or above
cuDensityMat supports NVIDIA HPC SDK 21.7 or above
Tensor and scalar callback function signatures have changed to support operator batching
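
The version minimums above can be checked with a small helper such as the one sketched below. This is an illustrative sketch only: the function names are hypothetical, the minimums are copied from the list above, and how the installed versions are discovered is outside its scope.

```python
def parse_version(text):
    """Split a dotted version string such as '2.2.0' into a list of integers."""
    return [int(part) for part in text.split(".")]


def meets_minimum(found, minimum):
    """Return True if version `found` is at least `minimum`.

    Both versions are padded with zeros to equal length so that '2.2'
    and '2.2.0' compare as equal, and components are compared numerically
    so that '21.11' ranks above '21.7'.
    """
    a, b = parse_version(found), parse_version(minimum)
    width = max(len(a), len(b))
    a += [0] * (width - len(a))
    b += [0] * (width - len(b))
    return a >= b


# Minimums taken from the compatibility notes above.
MINIMUMS = {"CUDA": "12", "cuTENSOR": "2.2.0", "NVIDIA HPC SDK": "21.7"}


def unmet_requirements(installed):
    """List the components in `installed` that fall below their minimum."""
    return [
        name
        for name, minimum in MINIMUMS.items()
        if name in installed and not meets_minimum(installed[name], minimum)
    ]


# Example with made-up installed versions.
problems = unmet_requirements({"CUDA": "12.4", "cuTENSOR": "2.1.9"})
```

Comparing integer components rather than raw strings avoids the common mistake of ranking "21.11" below "21.7".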

Known issues:

Activating cuDensityMat logging by setting the environment variable CUDENSITYMAT_LOG_LEVEL while using GPU-side tensor/scalar callbacks will result in a crash
Multidiagonal elementary operators of mode dimension 2 can cause an illegal memory access. As a workaround, please use dense elementary operators for any elementary operator with mode dimension 2 instead. Multidiagonal elementary operators for mode dimensions larger than 2 are not affected by this issue.
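
The dense-operator workaround amounts to expanding the diagonals into a full matrix before defining the operator. The sketch below shows that expansion in plain Python. The storage layout it assumes (one list per diagonal, indexed by row, plus a list of integer offsets) is a hypothetical choice for illustration and is not taken from the cuDensityMat API, which the sketch does not call.

```python
def multidiagonal_to_dense(diagonals, offsets, dim):
    """Expand a multidiagonal matrix into a dense row-major list of lists.

    Assumed (hypothetical) layout: diagonals[k][row] is the matrix element
    at position (row, row + offsets[k]). Entries whose column index falls
    outside the matrix are ignored.
    """
    dense = [[0j] * dim for _ in range(dim)]
    for diagonal, offset in zip(diagonals, offsets):
        for row in range(dim):
            col = row + offset
            if 0 <= col < dim:
                dense[row][col] = diagonal[row]
    return dense


# A Pauli-X-like operator on a mode of dimension 2, given as two
# off-diagonals with offsets +1 and -1.
sigma_x = multidiagonal_to_dense(
    diagonals=[[1, 0], [0, 1]],
    offsets=[1, -1],
    dim=2,
)
```

For mode dimension 2 the dense form holds only four elements, so the workaround adds negligible memory overhead.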

Initial release
Single-GPU and multi-GPU/multi-node capabilities (requires MPI)
Support Linux x86_64 and Linux Arm64 targets
Support Volta, Turing, Ampere, Ada, and Hopper NVIDIA GPU architectures (compute capability 7.0+)

Compatibility notes: