cuDensityMat: A High-Performance Library for Analog Quantum Dynamics Computations

Welcome to the cuDensityMat library documentation!

NVIDIA cuDensityMat, a component of the NVIDIA cuQuantum SDK, is a high-performance library for accelerating analog quantum dynamics solvers. Its functionality is described in Overview, and the installation and usage guide is provided in Getting Started.

Key Features

Provides APIs for:
- Defining arbitrary, possibly batched pure/mixed quantum states in arbitrary tensor-product spaces (any number and dimensions of quantum degrees of freedom).
- Defining arbitrary, possibly batched quantum many-body operators and superoperators in the form of a sum of tensor products of elementary tensor operators, full matrix operators, and scalar coefficients. Each elementary tensor operator acts on one or more specific quantum degrees of freedom, whereas each full matrix operator acts on all quantum degrees of freedom. Full matrix operators, elementary tensor operators, and their associated scalar coefficients may depend on time and on user-provided real parameters.
- Computing the action of a many-body operator/superoperator on a quantum state.
- Computing the vector-Jacobian product (VJP) and gradients of the many-body operator/superoperator action on a quantum state with respect to the user-provided real parameters that parameterize the elementary tensor operators, matrix operators, and scalar coefficients inside the operator/superoperator.
- Composing and computing the right-hand side of a desired quantum dynamics master equation (an ordinary differential equation) or of a system of coupled master equations (a system of ordinary differential equations).
- Computing properties of operators and quantum states such as expectation values.
- Computing the extreme eigenspectrum of an operator.
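To make the operator form described in the list above concrete, here is a minimal plain-Python sketch of an operator written as a sum of scalar coefficients times tensor products of elementary operators, together with its action on a pure state and the resulting expectation value. It does not use cuDensityMat: every name in it (`kron`, `embed`, `build_operator`, `apply_operator`, `expectation`, `drive`) is hypothetical and chosen for illustration, the two-mode Hamiltonian is an arbitrary example, and the dense matrices say nothing about how the library itself represents or evaluates operators.

```python
# Illustrative plain-Python sketch; none of these names are cuDensityMat APIs.

I2 = [[1, 0], [0, 1]]   # identity on one two-level mode
X = [[0, 1], [1, 0]]    # Pauli X
Z = [[1, 0], [0, -1]]   # Pauli Z


def kron(a, b):
    """Kronecker (tensor) product of two dense matrices stored as nested lists."""
    return [[x * y for x in row_a for y in row_b] for row_a in a for row_b in b]


def embed(ops_by_mode, num_modes):
    """Tensor product over all modes, using the identity on modes not listed."""
    full = [[1]]
    for mode in range(num_modes):
        full = kron(full, ops_by_mode.get(mode, I2))
    return full


def build_operator(terms, num_modes, t, params):
    """Sum coefficient * tensor-product terms into one dense matrix."""
    dim = 2 ** num_modes
    total = [[0.0] * dim for _ in range(dim)]
    for coeff, ops_by_mode in terms:
        # A coefficient may be a constant or a function of time and parameters.
        c = coeff(t, params) if callable(coeff) else coeff
        term = embed(ops_by_mode, num_modes)
        total = [[acc + c * x for acc, x in zip(row_t, row_m)]
                 for row_t, row_m in zip(total, term)]
    return total


def apply_operator(op, state):
    """Action of a dense operator on a pure state vector."""
    return [sum(x * y for x, y in zip(row, state)) for row in op]


def expectation(op, state):
    """<state| op |state> for a normalized pure state."""
    return sum(s.conjugate() * a for s, a in zip(state, apply_operator(op, state)))


def drive(t, params):
    """Hypothetical time- and parameter-dependent scalar coefficient."""
    return params["g"] * t


# H(t) = -Z0 Z1 + g*t * (X0 + X1) on two two-level modes (arbitrary example).
terms = [(-1.0, {0: Z, 1: Z}), (drive, {0: X}), (drive, {1: X})]
H = build_operator(terms, num_modes=2, t=2.0, params={"g": 0.5})

psi = [1, 0, 0, 0]                 # the state |00>
action = apply_operator(H, psi)    # H|00>
energy = expectation(H, psi)       # <00|H|00>
print(energy)                      # -1.0
```

The dense Kronecker products grow exponentially with the number of modes, so this layout is only practical for a handful of degrees of freedom; it is meant to show the structure of the terms, not a scalable method.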

Support

Supported GPU Architectures: Turing, Ampere, Ada, Hopper, Blackwell
Supported OS: Linux
Supported CPU Architectures: x86_64, ARM64

Prerequisites

One of the following CUDA Toolkits and a compatible driver are required:

CUDA Toolkit	Minimum Required Linux Driver Version
CUDA® 12.x	>= 525.60.13
CUDA® 13.x	>= 580.65.06

Please refer to CUDA Toolkit Release Notes for the details.
cuTENSOR v2.3.1
cuTensorNet v2.9.0
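As a convenience, the minimum driver versions listed above can be checked programmatically before building or running. The following is a minimal sketch, not part of cuDensityMat: the helper names `parse_version` and `driver_is_sufficient` are hypothetical, and the driver version string is assumed to have been obtained separately, for example from `nvidia-smi --query-gpu=driver_version --format=csv,noheader`.

```python
# Minimum Linux driver versions copied from the prerequisites above,
# keyed by CUDA Toolkit major version.
MIN_DRIVER = {12: (525, 60, 13), 13: (580, 65, 6)}


def parse_version(text):
    """Turn a dotted version string such as '525.60.13' into a tuple of ints."""
    return tuple(int(part) for part in text.strip().split("."))


def driver_is_sufficient(cuda_major, driver_version):
    """Return True if driver_version meets the minimum for the given CUDA major version."""
    required = MIN_DRIVER.get(cuda_major)
    if required is None:
        raise ValueError(f"CUDA {cuda_major}.x is not listed as supported")
    # Tuples compare element by element, which matches dotted-version ordering.
    return parse_version(driver_version) >= required


ok_12 = driver_is_sufficient(12, "535.104.05")
ok_13 = driver_is_sufficient(13, "570.86.10")
print(ok_12, ok_13)
```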

Known Issues and Limitations

Computation of gradients via cudensitymatOperatorComputeActionBackwardDiff() currently does not support multi-diagonal elementary tensor operators, batched operators and/or quantum states, or distributed execution.
Computation of the extreme eigenspectrum of an operator is currently only supported for non-batched Hermitian operators and pure non-batched quantum states.
While cudensitymatComputeType_t is exposed as a parameter in some APIs, currently the actual compute type is solely inferred from the data type.
The tensor callback feature currently requires CUDA stream synchronization between successive operator action compute calls.
The use of host-side CPU scalar/tensor (or gradient) callbacks triggers synchronization of the CUDA stream.
Eigenspectrum computation triggers synchronization of the CUDA stream.
Distributed execution of compute calls triggers synchronization of the CUDA stream on each process.
The number of parallel processes (equivalently, number of GPUs) in parallel distributed runs must be a power of two, one GPU per parallel process.
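The power-of-two constraint on the process count is easy to validate in a launch script before starting a distributed run. The sketch below is a hypothetical helper, not a cuDensityMat API; it assumes the process count is already known to the caller (for instance, from the job launcher).

```python
def is_power_of_two(n):
    """True for 1, 2, 4, 8, ...; a power of two has exactly one bit set."""
    return n > 0 and (n & (n - 1)) == 0


def largest_usable_process_count(available_gpus):
    """Largest power of two that does not exceed the number of available GPUs."""
    if available_gpus < 1:
        raise ValueError("at least one GPU is required")
    return 1 << (available_gpus.bit_length() - 1)


# With 6 GPUs available, only 4 can be used (one GPU per process).
usable = largest_usable_process_count(6)
print(is_power_of_two(6), usable)
```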
