cuDensityMat: A High-Performance Library for Analog Quantum Dynamics Computations

Welcome to the cuDensityMat library documentation!

NVIDIA cuDensityMat, a component of the NVIDIA cuQuantum SDK, is a high-performance library for accelerating analog quantum dynamics solvers. The Overview describes the library's functionality, and Getting Started provides the installation and usage guide.

Key Features

  • Provides APIs for:

    • Defining arbitrary, possibly batched pure/mixed quantum states in arbitrary tensor-product spaces (any number and dimensions of quantum degrees of freedom).

    • Defining arbitrary, possibly batched quantum many-body operators and superoperators in the form of a sum of tensor products of elementary tensor operators, full matrix operators, and scalar coefficients. Each elementary tensor operator acts on one or more specific quantum degrees of freedom, whereas each full matrix operator acts on all quantum degrees of freedom. Full matrix operators, elementary tensor operators, and their associated scalar coefficients may depend on time and on user-provided real parameters.

    • Computing the action of a many-body operator/superoperator on a quantum state.

    • Computing the vector-Jacobian product (VJP) and gradients of the many-body operator/superoperator action on a quantum state with respect to the user-provided real parameters that parameterize the elementary tensor operators, matrix operators, and scalar coefficients inside the operator/superoperator.

    • Composing and computing the right-hand side of a desired quantum dynamics master equation (ordinary differential equation) or of a system of coupled master equations.

    • Computing properties of quantum states such as expectation values.
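
To make the features above concrete, the following is a minimal NumPy sketch of the underlying mathematics, not of the cuDensityMat API. It builds a two-qubit operator as a sum of tensor products of elementary operators with a time- and parameter-dependent scalar coefficient, evaluates the right-hand side of a Lindblad master equation on a mixed state, and computes an expectation value. All function names (`embed`, `hamiltonian`, `lindblad_rhs`, `expectation`), the specific Hamiltonian, and the parameters `omega`, `g`, and `gamma` are illustrative assumptions that do not appear in the library.

```python
import numpy as np

# Elementary single-qubit operators (local dimension 2).
I2 = np.eye(2, dtype=complex)
X = np.array([[0, 1], [1, 0]], dtype=complex)
Z = np.array([[1, 0], [0, -1]], dtype=complex)
SM = np.array([[0, 1], [0, 0]], dtype=complex)  # |0><1|, maps |1> to |0>


def embed(op, site, num_sites=2):
    """Tensor product with `op` on `site` and the identity elsewhere."""
    factors = [op if k == site else I2 for k in range(num_sites)]
    out = factors[0]
    for factor in factors[1:]:
        out = np.kron(out, factor)
    return out


def hamiltonian(t, omega, g):
    """Sum of tensor products; the coupling coefficient depends on t."""
    coupling = g * np.cos(omega * t)  # time- and parameter-dependent scalar
    local = 0.5 * omega * (embed(Z, 0) + embed(Z, 1))
    return local + coupling * (embed(X, 0) @ embed(X, 1))


def lindblad_rhs(rho, t, omega, g, gamma):
    """d(rho)/dt = -i[H, rho] + L rho L^+ - (1/2){L^+ L, rho}."""
    H = hamiltonian(t, omega, g)
    L = np.sqrt(gamma) * embed(SM, 0)  # dissipation acting on site 0 only
    LdL = L.conj().T @ L
    unitary = -1j * (H @ rho - rho @ H)
    dissipator = L @ rho @ L.conj().T - 0.5 * (LdL @ rho + rho @ LdL)
    return unitary + dissipator


def expectation(rho, op):
    """Tr(rho op) for a density matrix (or a derivative of one)."""
    return np.trace(rho @ op)


# Mixed-state representation of |11><11| (basis index = 2*s0 + s1).
rho0 = np.zeros((4, 4), dtype=complex)
rho0[3, 3] = 1.0

drho = lindblad_rhs(rho0, t=0.0, omega=1.0, g=0.3, gamma=0.1)
rate = expectation(drho, embed(Z, 0)).real  # d<Z_0>/dt at t = 0
print("trace of d(rho)/dt:", abs(np.trace(drho)))
print("d<Z_0>/dt:", rate)
```

The sketch forms dense matrices with `np.kron`, which is only practical for very small systems; it is meant to show what "operator action on a state" and "master equation right-hand side" refer to, under the stated assumptions.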

Support

  • Supported GPU Architectures: Volta, Turing, Ampere, Ada, Hopper, Blackwell

  • Supported OS: Linux

  • Supported CPU Architectures: x86_64, ARM64

Prerequisites

Known Issues and Limitations

  • Computation of gradients via cudensitymatOperatorComputeActionBackwardDiff() currently does not support multidiagonal elementary tensor operators, batched operators and/or states, or distributed execution.

  • The tensor callback feature currently requires CUDA stream synchronization between successive operator action compute calls.

  • Although cudensitymatComputeType_t is exposed as a parameter in some APIs, the actual compute type is currently inferred solely from the data type.

  • The number of parallel processes (equivalently, the number of GPUs) in distributed parallel runs must be a power of two, with one GPU per parallel process.
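
The power-of-two requirement in the last item can be checked before launching a distributed run. The helper below is a hypothetical pre-flight check written in plain Python; the names `is_power_of_two` and `check_process_count` are assumptions for illustration and are not part of cuDensityMat.

```python
def is_power_of_two(n: int) -> bool:
    """True if n is a positive integer power of two (1, 2, 4, 8, ...)."""
    # A power of two has exactly one bit set, so n & (n - 1) clears it to 0.
    return n > 0 and (n & (n - 1)) == 0


def check_process_count(num_processes: int) -> None:
    """Raise ValueError if the process (and hence GPU) count is unsupported."""
    if not is_power_of_two(num_processes):
        raise ValueError(
            f"{num_processes} parallel processes requested; "
            "the count must be a power of two, with one GPU per process."
        )


check_process_count(8)  # passes silently
```

In an MPI-based launch, the process count passed to such a check would typically be the size of the communicator used for the run.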