Release Notes#

cuDensityMat v0.6.0#

New features:

  • Support ground-state eigen-decomposition of a Hermitian operator encoded as a matrix product operator (MPO), acting on a matrix-product state (MPS) with at least two sites, via the split-scope density matrix renormalization group (DMRG) method with a block Krylov subspace local solver. Both the 1-site and the 2-site variants are supported, selected by CUDENSITYMAT_EIGEN_SPLIT_SCOPE_DMRG_NUM_SITES (1 or 2). The 2-site variant solves two adjacent sites jointly and re-splits the result with an SVD truncation, so the intervening bond dimension adapts during the computation up to its buffer maximum; the truncation is configured by attaching an SVD configuration via CUDENSITYMAT_EIGEN_SPLIT_SCOPE_DMRG_SVD_CONFIG (including CUDENSITYMAT_SVD_CONFIG_MAX_EXTENT), and the current per-bond extents are seeded and inspected through cudensitymatStateMPSSetCurrentBondExtents / cudensitymatStateMPSGetCurrentBondExtents. Currently restricted to the smallest-real eigenpair of a single MPO product, open boundary conditions, single-GPU/single-node, non-batched. See the new dmrg_eigensolver_one_site_example.cpp and dmrg_eigensolver_two_site_example.cpp for the canonical end-to-end patterns.

  • Support the action of an operator encoded as a matrix product operator (MPO) on a quantum state in the matrix-product-state (MPS) factorization via the split-scope variational ALS fitting path (single-site variant). The operator must consist of a single MPO product within a single operator term; multi-MPO products and sums of terms are not currently supported. Open boundary conditions only, single-GPU/single-node, non-batched, no gradient support. The new C-API entries are cudensitymatOperatorActionConfigure together with the cudensitymatStateFittingScopeSplitALSConfig and cudensitymatStateFittingApproachLinSolveConfig configuration quartets (Create / Destroy / SetAttribute / GetAttribute). See the new mps_mpo_action_example.cpp for the canonical end-to-end pattern.

  • Support the two-site variant of the split-scope TDVP time propagation for pure matrix-product-state (MPS) quantum states, selected via the CUDENSITYMAT_PROPAGATION_SPLIT_SCOPE_TDVP_NUM_SITES attribute set to 2. The two-site update evolves neighboring site tensors together and re-splits them via a truncated SVD, allowing the MPS bond extents to grow or shrink dynamically. The truncation policy is supplied through a cudensitymatSVDConfig_t object (Create / Destroy / SetAttribute / GetAttribute) attached via CUDENSITYMAT_PROPAGATION_SPLIT_SCOPE_TDVP_SVD_CONFIG, including a global cap on the retained bond extent via CUDENSITYMAT_SVD_CONFIG_MAX_EXTENT. An MPS state’s current (valid) bond extents can now be recorded and queried independently of its maximum (buffer) bond extents via cudensitymatStateMPSSetCurrentBondExtents and cudensitymatStateMPSGetCurrentBondExtents, enabling truncation monitoring. Open boundary conditions only, single-GPU/single-node, non-batched, and double precision only (CUDA_C_64F). See the new mps_tdvp_two_site_example.cpp for the canonical end-to-end pattern.
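
The bond-extent truncation used by the two-site DMRG and TDVP variants above can be pictured with a small, library-independent sketch. The helper truncatedBondExtent is hypothetical and is not part of the cuDensityMat API. It assumes the singular values arrive sorted in descending order, and it uses a relative cutoff purely for illustration; the criteria the library actually applies are whatever is configured on the attached SVD configuration. The two caps mirror the roles of CUDENSITYMAT_SVD_CONFIG_MAX_EXTENT and the maximum (buffer) bond extent of the MPS.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical helper (NOT a cuDensityMat function): decide how many singular
// values to keep when a jointly updated two-site tensor is split back into
// two site tensors. Assumes `singularValues` is sorted in descending order.
std::size_t truncatedBondExtent(const std::vector<double>& singularValues,
                                std::size_t maxExtent,        // cap, in the spirit of CUDENSITYMAT_SVD_CONFIG_MAX_EXTENT
                                std::size_t bufferMaxExtent,  // maximum (buffer) bond extent of the MPS
                                double relativeCutoff)        // assumed criterion, for illustration only
{
    if (singularValues.empty()) return 0;

    // Discard singular values at or below relativeCutoff * (largest value).
    const double threshold = relativeCutoff * singularValues.front();
    std::size_t kept = 0;
    for (double s : singularValues) {
        if (s <= threshold) break;
        ++kept;
    }
    kept = std::max<std::size_t>(kept, 1);  // always keep at least one value

    // The new bond extent can exceed neither the configured cap nor the buffer.
    return std::min({kept, maxExtent, bufferMaxExtent});
}
```

With singular values {1.0, 0.5, 0.1, 1e-12}, a cutoff of 1e-8, and generous caps, the sketch keeps three values; lowering either cap shrinks the result accordingly, which is the sense in which the bond extent can grow or shrink up to its buffer maximum.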

Bugs fixed:

  • Fixed a functionality bug where re-preparing an existing cudensitymatTimePropagation_t with cudensitymatTimePropagationPrepare did not always pick up changes made since the previous preparation, such as a new configuration set with cudensitymatTimePropagationConfigure, so cudensitymatTimePropagationCompute could keep using the earlier configuration or inputs. Re-preparation now reflects the current configuration and inputs.

Compatibility notes:

  • cuDensityMat requires cuTENSOR v2.5.0 or above. However, cuTENSOR v2.7.0 or above is recommended for improved performance.

  • Starting with the next release, Fedora 42 and KylinOS V10 will no longer be supported. The minimum supported versions will be Fedora 44 and KylinOS V11.
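
The cuTENSOR requirement above (v2.5.0 as the minimum, v2.7.0 or above recommended) can be expressed as a simple version comparison. The sketch below is an illustration only: the Version struct, the CutensorSupport enum, and classifyCutensor are hypothetical names, and how the installed version is obtained at run time (for example through cuTENSOR's own version query) is deliberately left out.

```cpp
// Hypothetical version triple; the field names avoid clashing with the
// `major`/`minor` macros that some system headers define.
struct Version {
    int majorVer;
    int minorVer;
    int patchVer;
};

// Returns true when `v` is the same as or newer than `minimum`.
bool atLeast(const Version& v, const Version& minimum) {
    if (v.majorVer != minimum.majorVer) return v.majorVer > minimum.majorVer;
    if (v.minorVer != minimum.minorVer) return v.minorVer > minimum.minorVer;
    return v.patchVer >= minimum.patchVer;
}

enum class CutensorSupport { Unsupported, Supported, Recommended };

// Classify a cuTENSOR version against the v0.6.0 compatibility note:
// v2.5.0 is required, v2.7.0 or above is recommended.
CutensorSupport classifyCutensor(const Version& v) {
    if (!atLeast(v, Version{2, 5, 0})) return CutensorSupport::Unsupported;
    if (!atLeast(v, Version{2, 7, 0})) return CutensorSupport::Supported;
    return CutensorSupport::Recommended;
}
```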

cuDensityMat v0.5.2#

Compatibility notes:

  • CUDENSITYMAT_OPERATOR_SPECTRUM_CONFIG_MAX_RESTARTS now follows the standard restarted-Krylov convention: the value specifies the maximum number of thick restarts of the block Krylov algorithm, so the total number of Krylov-subspace expansions performed is at most max_restarts + 1 (one initial expansion plus up to max_restarts restarted expansions). A value of 0 is now also accepted and selects a single expansion with no restart. The default value has been changed from 20 to 19, so the default behaviour is unchanged.
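
The arithmetic behind the new convention is small enough to spell out. The helper below is a hypothetical illustration, not a library function; it only restates the bound given in the note above.

```cpp
#include <cstdint>

// Hypothetical helper: upper bound on the number of Krylov-subspace expansions
// under the v0.5.2 convention, i.e. one initial expansion plus up to
// `maxRestarts` restarted expansions.
std::int32_t maxKrylovExpansions(std::int32_t maxRestarts) {
    return maxRestarts + 1;
}
```

Under this reading, the new default of 19 allows at most 20 expansions, and a value of 0 allows exactly one expansion with no restart.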

cuDensityMat v0.5.1#

Bugs fixed:

  • Fixed a correctness bug in the OperatorSpectrum computation where the returned eigenvalues were always the smallest-magnitude eigenvalues, regardless of the requested cudensitymatOperatorSpectrumKind_t value.

  • Fixed an incorrectly documented default value for CUDENSITYMAT_PROPAGATION_APPROACH_KRYLOV_ADAPTIVE_STEP_SIZE.

cuDensityMat v0.5.0#

New features:

  • Support the time-dependent variational principle (TDVP) propagator for the matrix-product-state (MPS) factorization of the quantum state and the matrix product operator (MPO) representation of the operator, single-site variant only. The operator must consist of a single MPO term; products of multiple MPOs and sums of MPO terms are not currently supported. Multi-GPU/multi-node execution, batching, and gradients are not supported yet.

  • Support integration with JAX XLA/Autograd, including JIT, VJP, and VMAP (with restrictions).

Bugs fixed:

  • Fixed a functional bug in the multi-GPU multi-node operator action computation that was causing failures in certain situations involving odd mode extents.

  • Fixed a functional bug in the multi-GPU multi-node operator action computation that was causing failures in certain situations due to overflow in MPI message sizes.
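
The message-size overflow mentioned above belongs to a well-known class of problems: classic MPI calls take an int element count, so a transfer of more than INT_MAX elements has to be split up (or sent with a large-count interface). The sketch below illustrates that general technique only; splitIntoIntSizedChunks is a hypothetical helper and says nothing about how cuDensityMat itself resolved the issue.

```cpp
#include <algorithm>
#include <climits>
#include <cstddef>
#include <vector>

// Hypothetical helper: split `elementCount` elements into consecutive chunks
// whose sizes each fit into an `int`, as required by int-count message APIs.
// `maxChunk` is clamped to the range [1, INT_MAX].
std::vector<int> splitIntoIntSizedChunks(
    std::size_t elementCount,
    std::size_t maxChunk = static_cast<std::size_t>(INT_MAX))
{
    maxChunk = std::max<std::size_t>(
        1, std::min<std::size_t>(maxChunk, static_cast<std::size_t>(INT_MAX)));

    std::vector<int> chunks;
    while (elementCount > 0) {
        const std::size_t n = std::min(elementCount, maxChunk);
        chunks.push_back(static_cast<int>(n));  // safe: n <= INT_MAX
        elementCount -= n;
    }
    return chunks;
}
```

Each returned chunk size could then be passed as the count of one send/receive pair, with the buffer pointer advanced by the sum of the preceding chunks.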

Improvements:

  • Operator action performance for the single-mode multidiagonal (DIA) elementary tensor operators is improved.

Compatibility notes:

  • cuDensityMat requires cuTENSOR v2.5.0 or above.

  • cuDensityMat requires cuTensorNet v2.12.0 or above.

cuDensityMat v0.4.0#

New features:

  • Enabled support for backward differentiation of the operator action for batched operators and quantum states.

  • Added the NCCL distributed communication interface as an alternative to MPI for multi-GPU/multi-node execution (experimental).

Improvements:

  • Increased support for single-mode multidiagonal (DIA) elementary tensor operators from 256 to 2048 non-zero diagonals.

  • Improved performance of sparse DIA elementary tensor operators acting on a pure or mixed quantum state for Hilbert spaces flattened into a single mode (vector space).

cuDensityMat v0.3.2#

Improvements:

  • Explicitly optimized support of quantum states allocated in unified/managed (CPU+GPU) memory.

  • Improved performance for sparse general DIA elementary tensor operators.

  • Improved performance of extreme eigen-spectrum computation for Hermitian operators.

cuDensityMat v0.3.1#

Bugs fixed:

  • Fixed a convergence issue in extreme eigen-spectrum computation for certain Hermitian operators.

cuDensityMat v0.3.0#

New features:

  • Support extreme eigen-spectrum computation for arbitrary non-batched Hermitian operators.

  • Support integration with JAX XLA jitting and vector-jacobian product (VJP).

Bugs fixed:

  • Fixed a functional bug for certain mixed quantum state calculations in multi-node multi-GPU execution with more than 4 processes where cudensitymatOperatorPrepareAction() may have run indefinitely.

  • Fixed a functional bug in the multi-GPU multi-node expectation value computation that may have run into CUDENSITYMAT_INTERNAL_ERROR for some workspace buffer sizes.

  • Fixed the internal check for the number of parallel processes to be a power of two.
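
For reference, a power-of-two test on a process count is commonly written with a single bit trick. The function below is a generic sketch with a hypothetical name; it is not taken from the library's internals.

```cpp
#include <cstdint>

// Hypothetical helper: true when `n` is a positive power of two (1, 2, 4, ...).
// A power of two has exactly one bit set, so clearing the lowest set bit
// with n & (n - 1) must give zero.
bool isPowerOfTwo(std::uint64_t n) {
    return n != 0 && (n & (n - 1)) == 0;
}
```

Note that a single process (n = 1) passes the test, since 1 is 2 to the power 0.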

Compatibility notes:

  • cuDensityMat requires cuTENSOR v2.3.1 or above.

  • cuDensityMat requires cuTensorNet v2.9.0 or above.

cuDensityMat v0.2.0#

New features:

  • Support vector-jacobian product (VJP) computation (backward differentiation) for non-batched single-GPU execution with dense operators only (dense elementary tensor operators and dense matrix operators).

  • Support new backward-differentiation gradient callbacks for VJP computation.

  • Support single-mode multidiagonal elementary tensor operators with an arbitrary mode extent and up to 256 non-zero diagonals.

  • Support Volta, Turing, Ampere, Ada, Hopper, and Blackwell NVIDIA GPU architectures (compute capability 7.0+)
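
To make the multidiagonal (DIA) operator format mentioned above more concrete, here is a minimal CPU-side sketch of a DIA matrix acting on a vector. The DiaOperator struct and its storage layout (one row-indexed array of length `extent` per stored diagonal) are assumptions chosen for illustration; the layout cuDensityMat expects for its multidiagonal elementary tensor operators is defined in its API documentation and may differ.

```cpp
#include <complex>
#include <cstddef>
#include <cstdint>
#include <vector>

using cplx = std::complex<double>;

// Hypothetical DIA container (layout is an assumption for illustration):
// stored diagonal d holds the entries A(i, i + offsets[d]) at
// values[d * extent + i]; entries that fall outside the matrix are ignored.
struct DiaOperator {
    std::size_t extent;                 // mode extent (matrix is extent x extent)
    std::vector<std::int64_t> offsets;  // 0 = main diagonal, >0 above, <0 below
    std::vector<cplx> values;           // offsets.size() * extent entries
};

// Computes y = A * x using only the stored diagonals.
std::vector<cplx> applyDia(const DiaOperator& op, const std::vector<cplx>& x) {
    std::vector<cplx> y(op.extent, cplx(0.0, 0.0));
    const std::int64_t n = static_cast<std::int64_t>(op.extent);
    for (std::size_t d = 0; d < op.offsets.size(); ++d) {
        for (std::int64_t i = 0; i < n; ++i) {
            const std::int64_t j = i + op.offsets[d];  // column index
            if (j < 0 || j >= n) continue;             // outside the matrix
            const std::size_t row = static_cast<std::size_t>(i);
            y[row] += op.values[d * op.extent + row] * x[static_cast<std::size_t>(j)];
        }
    }
    return y;
}
```

The cost scales with the number of stored diagonals times the mode extent rather than with the extent squared, which is why the format pays off for operators with only a few non-zero diagonals.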

Compatibility notes:

  • cuDensityMat requires cuTENSOR 2.2.0 or above

  • cuDensityMat now requires cuTensorNet 2.8.0 or above

  • Removed cudensitymatOperatorTermAppendGeneralProduct C API function

Known issues:

  • The multi-GPU multi-node expectation value computation may run into CUDENSITYMAT_INTERNAL_ERROR for some workspace buffer sizes. The workaround is to set the workspace buffer size to exactly match the value returned by the cudensitymatExpectationPrepare call, or use the maximum of the workspace buffer sizes across all MPI ranks for each rank.

Bugs fixed:

  • Fixed multidiagonal elementary tensor operators of mode dimension 2 causing illegal memory access.

cuDensityMat v0.1.0#

  • Support full matrix operators (operator matrices defined in the full composite space)

  • Support batched operators (both elementary tensor operators and full matrix operators)

  • Support both CPU-side and GPU-side tensor/scalar callbacks

  • Support Volta, Turing, Ampere, Ada, Hopper, and Blackwell NVIDIA GPU architectures (compute capability 7.0+)

Compatibility notes:

  • cuDensityMat requires CUDA 12 or above

  • cuDensityMat requires cuTENSOR 2.2.0 or above

  • cuDensityMat supports NVIDIA HPC SDK 21.7 or above

  • Tensor and scalar callback function signatures have changed to support operator batching

Known issues:

  • Activating cuDensityMat logging by setting environment variable CUDENSITYMAT_LOG_LEVEL while using GPU-side tensor/scalar callbacks will result in a crash

  • Multidiagonal elementary operators of mode dimension 2 can cause an illegal memory access. As a workaround, please use dense elementary operators for any elementary operator with mode dimension 2 instead. Multidiagonal elementary operators for mode dimensions larger than 2 are not affected by this issue.

cuDensityMat v0.0.5#

  • Initial release

  • Single-GPU and multi-GPU/multi-node capabilities (requires MPI)

  • Support Linux x86_64 and Linux Arm64 targets

  • Support Volta, Turing, Ampere, Ada and Hopper NVIDIA GPU architectures (compute capability 7.0+)

Compatibility notes:

  • cuDensityMat requires CUDA 11.4 or above

  • cuDensityMat requires cuTENSOR 2.0.2 or above

  • cuDensityMat supports NVIDIA HPC SDK 21.7 or above