Overview
This section describes the basic working principle of the cuStateVec Ex API. For a general introduction to quantum circuits, please refer to Introduction to quantum computing.
Introduction
The cuStateVec Ex API is a set of APIs added to the cuStateVec library to make state vector simulators easier and quicker to develop. Built on top of the core cuStateVec library, the cuStateVec Ex APIs add resource management, flexible control over the state vector layout, and a performance-oriented update pipeline for quantum circuit simulation.
Key Features
cuStateVec Ex provides the following key capabilities:
- State Vector
  - Manages GPU memory and other computing resources
  - Advanced capabilities for managing qubit ordering and its permutation
  - State transfer between host and device memory
  - Interoperability that exposes internal computing resources, allowing users to reuse computation results
- State Vector Operations
  - Gate application supporting dense, diagonal, and anti-diagonal matrices and Pauli rotations
  - Multiple single-qubit measurements in a single call with flexible collapse operations
  - Sampling with configurable output ordering
  - Probability calculations with masking support
  - Expectation value computation for Pauli strings
- State Vector Updater
  - Encapsulates simulator pipeline to accelerate state vector updates
  - Queue-and-execute framework designed for reuse across multiple state vector updates
  - Gate matrix fusion to accelerate gate application
  - Support for mixed unitary channels and general quantum channels
Wire
In cuStateVec Ex, a wire is the logical entity used to represent qubits within quantum circuit simulations. This terminology originates from quantum circuit diagrams, where qubits are represented as horizontal lines (wires) that quantum gates act upon.
An index bit is the fundamental addressing mechanism that cuStateVec uses to represent a qubit. A state vector is stored as a complex array, and each bit of an array index represents one qubit as a classical entity.
Each wire is mapped to an index bit, and the array that holds this mapping is called the wire ordering. A state vector is a high-order tensor in which each mode has extent two (one mode per qubit), and the wire ordering specifies the order of those tensor modes. The number of wires equals the number of index bits in the state vector object.
A wire specifies a qubit in simulations. Throughout the cuStateVec Ex API, operations such as gate application, measurements, and permutations are specified in terms of wires such as target wires and control wires.
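To make the wire-to-index-bit mapping concrete, the following minimal sketch computes the state vector index of a computational basis state from a wire ordering. `basisIndex` is a hypothetical host-side helper written for this illustration, not a cuStateVec Ex function. It assumes only that position 0 of the wire ordering corresponds to the least significant index bit.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper (not part of cuStateVec Ex).
//   wireOrdering[pos] = wire stored at index-bit position pos (pos 0 = LSB)
//   wireValues[w]     = classical value (0 or 1) of wire w
// Returns the state vector index of that basis state.
std::int64_t basisIndex(const std::vector<int>& wireOrdering,
                        const std::vector<int>& wireValues) {
    assert(wireOrdering.size() == wireValues.size());
    std::int64_t index = 0;
    for (std::size_t pos = 0; pos < wireOrdering.size(); ++pos) {
        const int wire = wireOrdering[pos];
        assert(wire >= 0 && wire < static_cast<int>(wireValues.size()));
        assert(wireValues[wire] == 0 || wireValues[wire] == 1);
        // The value of this wire becomes bit `pos` of the index.
        index |= static_cast<std::int64_t>(wireValues[wire]) << pos;
    }
    return index;
}
```

With the default ordering {0, 1, 2}, setting only wire 0 to 1 gives index 1. With the ordering {2, 0, 1}, the same basis state lands at index 2, because wire 0 now sits at index-bit position 1.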
Key Components
State Vector
The cuStateVec Ex API defines custatevecExStateVectorDescriptor_t to represent a state vector object that provides comprehensive control over GPU memory and computational resources. Unlike the direct device pointer approach of cuStateVec, state vectors are represented as descriptor objects that encapsulate all necessary metadata and resource information.
Wire ordering and permutation capabilities allow dynamic control over the tensor mode structure of the state vector, so that its data can be organized for the computation at hand. Users can reassign wire orderings and apply scatter or gather permutations through custatevecExPermuteIndexBits to optimize data layout for specific computational patterns. This tensor-mode management is essential for efficient quantum circuit simulations. The following figure illustrates wire ordering permutation and the corresponding rearrangement of state vector elements.

Figure. Index bit permutation example using scatter permutation {1, 2, 0} on a 3-wire (qubit) system. The left side shows the wire ordering transformation, while the right side illustrates how state vector elements are correspondingly rearranged by custatevecExPermuteIndexBits. The first wire in the wire ordering is the LSB of the index bits.
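The rearrangement of state vector elements under an index bit permutation can be sketched on the host as follows. `scatterPermute` is a hypothetical reference function, not the cuStateVec Ex implementation. The scatter convention used here (the index bit at position i moves to position perm[i]) is an assumption made for illustration; consult the custatevecExPermuteIndexBits reference for the exact definition.

```cpp
#include <cassert>
#include <complex>
#include <cstddef>
#include <vector>

// Hypothetical host-side reference, not the cuStateVec Ex implementation.
// Assumed scatter convention: the index bit at position i moves to
// position perm[i]. Bit position 0 is the LSB.
std::vector<std::complex<double>> scatterPermute(
    const std::vector<std::complex<double>>& sv,
    const std::vector<int>& perm) {
    const std::size_t nBits = perm.size();
    assert(sv.size() == (std::size_t{1} << nBits));
    std::vector<std::complex<double>> out(sv.size());
    for (std::size_t oldIdx = 0; oldIdx < sv.size(); ++oldIdx) {
        std::size_t newIdx = 0;
        for (std::size_t i = 0; i < nBits; ++i) {
            const std::size_t bit = (oldIdx >> i) & 1u;
            newIdx |= bit << perm[i];  // move bit i to position perm[i]
        }
        out[newIdx] = sv[oldIdx];
    }
    return out;
}
```

Under this assumed convention and the permutation {1, 2, 0}, the amplitude at index 1 (0b001) moves to index 2 (0b010), the amplitude at index 2 moves to index 4, and the amplitude at index 4 moves to index 1.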
For basic data access, custatevecExStateVectorGetState and custatevecExStateVectorSetState provide an easy way to transfer state vector data between host and device memory.
For advanced use cases, the interoperability APIs expose internal computing resources through custatevecExGetResourcesFromDeviceSubSV and custatevecExGetResourcesFromDeviceSubSVView, allowing users to access underlying device pointers, CUDA streams, and cuStateVec handles. This enables integration with existing cuStateVec workflows and external computational libraries.
State Vector Operations
The cuStateVec Ex API set provides simulation primitives frequently used in state vector simulations. Each cuStateVec Ex API provides the same functionality as its cuStateVec counterpart, and its arguments appear in the same or a very similar order.
For example, custatevecExAbs2SumArray computes probability arrays and is functionally identical to custatevecAbs2SumArray. The first argument of the cuStateVec Ex API is the state vector descriptor, while the cuStateVec API requires four separate arguments (handle, sv, svDataType, nIndexBits) that the state vector descriptor encapsulates. The remaining arguments preserve their original roles and order.
// cuStateVec Ex API
custatevecStatus_t custatevecExAbs2SumArray(
    stateVector,
    abs2sum, outputOrdering, outputOrderingLen,
    maskBitString, maskWireOrdering, maskLen);

// cuStateVec API
custatevecStatus_t custatevecAbs2SumArray(
    handle, sv, svDataType, nIndexBits,
    abs2sum, bitOrdering, bitOrderingLen,
    maskBitString, maskOrdering, maskLen);
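To clarify the roles of the ordering and mask arguments, the following host-side sketch computes the same kind of probability array on a plain `std::vector`. `abs2SumReference` is a hypothetical function written for this illustration and calls no cuStateVec API. It works on index-bit positions and assumes that output bit j corresponds to bitOrdering[j] (with j = 0 as the LSB) and that only indices whose masked bits equal maskBitString contribute to the sums.

```cpp
#include <cassert>
#include <complex>
#include <cstddef>
#include <vector>

// Hypothetical host-side reference, not a cuStateVec function.
// Sums |amplitude|^2 over all index bits that are not listed in
// bitOrdering, restricted to indices that match the mask.
std::vector<double> abs2SumReference(
    const std::vector<std::complex<double>>& sv,
    const std::vector<int>& bitOrdering,
    const std::vector<int>& maskBitString,
    const std::vector<int>& maskOrdering) {
    assert(maskBitString.size() == maskOrdering.size());
    std::vector<double> abs2sum(std::size_t{1} << bitOrdering.size(), 0.0);
    for (std::size_t idx = 0; idx < sv.size(); ++idx) {
        // Skip indices whose masked bits differ from maskBitString.
        bool selected = true;
        for (std::size_t m = 0; m < maskOrdering.size(); ++m) {
            const int bit = static_cast<int>((idx >> maskOrdering[m]) & 1u);
            if (bit != maskBitString[m]) { selected = false; break; }
        }
        if (!selected) continue;
        // Gather the requested bits into the output index.
        std::size_t outIdx = 0;
        for (std::size_t j = 0; j < bitOrdering.size(); ++j) {
            outIdx |= ((idx >> bitOrdering[j]) & 1u) << j;
        }
        abs2sum[outIdx] += std::norm(sv[idx]);
    }
    return abs2sum;
}
```

For a two-qubit state with all four amplitudes equal to 0.5, requesting bit 0 without a mask yields {0.5, 0.5}. Adding a mask that requires bit 1 to be 1 yields {0.25, 0.25}.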
Similarly, custatevecExApplyMatrix has a simpler signature because several variables are packed into the state vector argument. It also extends gate application beyond the original API by supporting three distinct matrix types (dense, diagonal, and anti-diagonal), with an optimized execution path for each structure. Additionally, custatevecExMeasure provides an option to automatically reset the measured wire to the zero state after measurement, giving finer control over state collapse.
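The benefit of distinguishing matrix types can be seen in a single-qubit host-side sketch. The three functions below are hypothetical illustrations, not the cuStateVec Ex execution paths. They show why a diagonal or anti-diagonal matrix needs less work than a dense one: a diagonal gate scales each amplitude independently, and an anti-diagonal gate swaps each amplitude pair and scales it.

```cpp
#include <array>
#include <cassert>
#include <complex>
#include <cstddef>
#include <vector>

using Amp = std::complex<double>;

// Dense 2x2 gate, row-major {m00, m01, m10, m11}.
void applyDense(std::vector<Amp>& sv, int target, const std::array<Amp, 4>& m) {
    const std::size_t stride = std::size_t{1} << target;
    assert(sv.size() >= 2 * stride);
    for (std::size_t i0 = 0; i0 < sv.size(); ++i0) {
        if (i0 & stride) continue;  // visit each amplitude pair once
        const std::size_t i1 = i0 | stride;
        const Amp a0 = sv[i0], a1 = sv[i1];
        sv[i0] = m[0] * a0 + m[1] * a1;
        sv[i1] = m[2] * a0 + m[3] * a1;
    }
}

// Diagonal gate {d0, d1}: one multiply per amplitude, no pairing needed.
void applyDiagonal(std::vector<Amp>& sv, int target, const std::array<Amp, 2>& d) {
    for (std::size_t i = 0; i < sv.size(); ++i) {
        sv[i] *= d[(i >> target) & 1u];
    }
}

// Anti-diagonal gate {m01, m10}: swap each amplitude pair and scale.
void applyAntiDiagonal(std::vector<Amp>& sv, int target, const std::array<Amp, 2>& a) {
    const std::size_t stride = std::size_t{1} << target;
    assert(sv.size() >= 2 * stride);
    for (std::size_t i0 = 0; i0 < sv.size(); ++i0) {
        if (i0 & stride) continue;
        const std::size_t i1 = i0 | stride;
        const Amp a0 = sv[i0], a1 = sv[i1];
        sv[i0] = a[0] * a1;
        sv[i1] = a[1] * a0;
    }
}
```

A Pauli Z gate can be passed as the diagonal {1, -1} and a Pauli X gate as the anti-diagonal {1, 1}; both give the same result as the corresponding dense 2x2 matrix.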
Other essential operations include sampling through custatevecExSample, which executes sampling in a single call. Expectation values for multiple Pauli strings are computed by custatevecExComputeExpectationOnPauliBasis.
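As a reference for what a Pauli-string expectation value means numerically, the following host-side sketch evaluates it for one Pauli string on a small state vector. `pauliExpectationReference` is a hypothetical function for this illustration and does not reflect how cuStateVec Ex computes the value. It assumes that character i of the string acts on index bit i.

```cpp
#include <cassert>
#include <complex>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical host-side reference, not a cuStateVec function.
// paulis[i] is one of 'I', 'X', 'Y', 'Z' and acts on index bit i.
double pauliExpectationReference(const std::vector<std::complex<double>>& sv,
                                 const std::string& paulis) {
    using Amp = std::complex<double>;
    assert(sv.size() == (std::size_t{1} << paulis.size()));
    Amp sum(0.0, 0.0);
    for (std::size_t idx = 0; idx < sv.size(); ++idx) {
        // A Pauli string maps basis state |idx> to phase * |flipped>.
        std::size_t flipped = idx;
        Amp phase(1.0, 0.0);
        for (std::size_t i = 0; i < paulis.size(); ++i) {
            const bool bitSet = ((idx >> i) & 1u) != 0;
            switch (paulis[i]) {
                case 'I': break;
                case 'X': flipped ^= std::size_t{1} << i; break;
                case 'Y':  // Y|0> = i|1>, Y|1> = -i|0>
                    flipped ^= std::size_t{1} << i;
                    phase *= bitSet ? Amp(0.0, -1.0) : Amp(0.0, 1.0);
                    break;
                case 'Z': if (bitSet) phase = -phase; break;
                default: assert(false && "unknown Pauli label");
            }
        }
        sum += std::conj(sv[flipped]) * phase * sv[idx];
    }
    return sum.real();  // the expectation of a Hermitian operator is real
}
```

For the two-qubit state with all amplitudes equal to 0.5, the string "XX" evaluates to 1 and the string "ZI" evaluates to 0.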
State Vector Updater
The State Vector Updater (SVUpdater) encapsulates the simulation pipeline for accelerated state vector updates. This component queues the operators of a circuit, then applies the queued operators to the state vector. The primary features of SVUpdater are gate fusion and noise channel application. Gate fusion fuses the matrices of queued gates, then applies the fused result to the specified state vector in a single scan. Noise channels (mixed unitary channels and general channels) are applied stochastically using user-provided random numbers. When a channel needs to apply a matrix, that matrix is fused in the same path as gate fusion.
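The idea behind gate fusion can be sketched for two single-qubit gates acting on the same target. The code below is a minimal host-side illustration with hypothetical helpers (`fuse`, `applyGate`); it is not the SVUpdater fusion algorithm, which is not described here. It shows only that multiplying the gate matrices first lets the state vector be scanned once instead of twice.

```cpp
#include <array>
#include <cassert>
#include <complex>
#include <cstddef>
#include <vector>

using Amp = std::complex<double>;
using Mat2 = std::array<Amp, 4>;  // row-major 2x2 matrix

// Returns later * earlier: the single matrix equivalent to applying
// `earlier` first and `later` second.
Mat2 fuse(const Mat2& later, const Mat2& earlier) {
    Mat2 out{{later[0] * earlier[0] + later[1] * earlier[2],
              later[0] * earlier[1] + later[1] * earlier[3],
              later[2] * earlier[0] + later[3] * earlier[2],
              later[2] * earlier[1] + later[3] * earlier[3]}};
    return out;
}

// One scan over the state vector, applying a 2x2 matrix to `target`.
void applyGate(std::vector<Amp>& sv, int target, const Mat2& m) {
    const std::size_t stride = std::size_t{1} << target;
    assert(sv.size() >= 2 * stride);
    for (std::size_t i0 = 0; i0 < sv.size(); ++i0) {
        if (i0 & stride) continue;  // visit each amplitude pair once
        const std::size_t i1 = i0 | stride;
        const Amp a0 = sv[i0], a1 = sv[i1];
        sv[i0] = m[0] * a0 + m[1] * a1;
        sv[i1] = m[2] * a0 + m[3] * a1;
    }
}
```

Applying X and then Z in two scans gives the same state as applying `fuse(Z, X)` in one scan. The memory traffic over the state vector is halved, at the cost of one small matrix product.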

Figure. SVUpdater workflow showing the two-step process: (1) Enqueue operators - gate matrices, mixed unitary channels, and general channels are queued into the SVUpdater; (2) Apply to state vector - the queued operations are applied to update the state vector with optimized fusion and execution.
The queuing system supports three types of operations: unitary gate matrix applications through custatevecExSVUpdaterEnqueueMatrix, mixed unitary channels through custatevecExSVUpdaterEnqueueUnitaryChannel, and general quantum channels using Kraus operators through custatevecExSVUpdaterEnqueueGeneralChannel. Each operation type can be queued with specific parameters.
Queued operators are applied to a specified state vector instance by calling custatevecExSVUpdaterApply. This API accepts user-generated random numbers for noise channel application. The number of required random numbers is retrieved by custatevecExSVUpdaterGetMaxNumRequiredRandnums.
This design enables reuse of queued operators across multiple state vector instances. Since random number sequences are user-generated, different noise effects can be simulated through successive custatevecExSVUpdaterApply calls, each with a different set of random numbers.
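How a user-provided random number selects a branch of a mixed unitary channel can be sketched on the host as follows. `selectBranch` and `applyBitFlipChannel` are hypothetical helpers for this illustration and do not reflect the SVUpdater internals. The sketch assumes one random number in [0, 1) per channel and uses a bit-flip channel (identity with probability 1 - p, Pauli X with probability p) as the example.

```cpp
#include <cassert>
#include <complex>
#include <cstddef>
#include <utility>
#include <vector>

// Picks a branch index by comparing r in [0, 1) against the
// cumulative probabilities of the channel.
std::size_t selectBranch(const std::vector<double>& probs, double r) {
    assert(!probs.empty() && r >= 0.0 && r < 1.0);
    double cumulative = 0.0;
    for (std::size_t k = 0; k + 1 < probs.size(); ++k) {
        cumulative += probs[k];
        if (r < cumulative) return k;
    }
    return probs.size() - 1;  // the last branch also absorbs round-off
}

// Bit-flip channel on `target`: identity with probability 1 - p,
// Pauli X with probability p. The same channel definition gives
// different outcomes for different random numbers r.
void applyBitFlipChannel(std::vector<std::complex<double>>& sv,
                         int target, double p, double r) {
    const std::size_t branch = selectBranch({1.0 - p, p}, r);
    if (branch == 0) return;  // identity branch: nothing to apply
    const std::size_t stride = std::size_t{1} << target;
    assert(sv.size() >= 2 * stride);
    for (std::size_t i0 = 0; i0 < sv.size(); ++i0) {
        if (i0 & stride) continue;
        std::swap(sv[i0], sv[i0 | stride]);  // Pauli X swaps each pair
    }
}
```

Calling `applyBitFlipChannel` on the state |0> with p = 0.25 leaves the state unchanged for r = 0.1 and flips it to |1> for r = 0.9. This mirrors how one queued channel produces different noise realizations across applications.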
The noise channels are integrated into the SVUpdater pipeline with performance considerations. Mixed unitary channels are randomly sampled and appropriately fused with other operations. For general quantum channels, Kraus operators are sampled considering the probability lower bound computed from eigenvalues, incorporating a delayed expectation value computation as an optimization for low-noise simulations. This optimization leverages the advanced techniques described in arXiv:2111.02396.
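For orientation, the textbook way to sample a Kraus operator is sketched below for a single qubit. This is explicitly not the lower-bound and delayed-expectation optimization mentioned above: it computes the exact probability ‖K_k ψ‖² for every candidate operator, which is the cost that optimization is meant to reduce. `applyGeneralChannel` is a hypothetical helper, and the amplitude damping operators in the usage note are a standard example chosen for illustration.

```cpp
#include <array>
#include <cassert>
#include <cmath>
#include <complex>
#include <cstddef>
#include <vector>

using Amp = std::complex<double>;
using Kraus = std::array<Amp, 4>;  // row-major 2x2 Kraus operator
using Qubit = std::array<Amp, 2>;  // single-qubit state

// Naive Kraus sampling (hypothetical helper, not the SVUpdater method).
// Selects operator k with probability ||K_k psi||^2 using r in [0, 1),
// replaces psi with the normalized K_k psi, and returns k.
std::size_t applyGeneralChannel(Qubit& psi, const std::vector<Kraus>& ops, double r) {
    assert(!ops.empty() && r >= 0.0 && r < 1.0);
    double cumulative = 0.0;
    for (std::size_t k = 0; k < ops.size(); ++k) {
        const Kraus& K = ops[k];
        const Qubit cand = {{K[0] * psi[0] + K[1] * psi[1],
                             K[2] * psi[0] + K[3] * psi[1]}};
        const double p = std::norm(cand[0]) + std::norm(cand[1]);
        cumulative += p;
        if (r < cumulative) {  // implies p > 0
            const double scale = 1.0 / std::sqrt(p);
            psi = Qubit{{cand[0] * scale, cand[1] * scale}};
            return k;
        }
    }
    // Reached only if the probabilities sum to at most r, for example
    // through round-off; psi is left unchanged in that case.
    return ops.size();
}
```

For amplitude damping with damping parameter 0.25 applied to |1>, the operators are K0 = diag(1, sqrt(0.75)) and K1 with the single nonzero entry 0.5 in the upper-right position. A random number of 0.5 selects K0 and leaves the state at |1>; a random number of 0.9 selects K1 and moves the state to |0>.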
The SVUpdater provides reasonable defaults while allowing users to control fusion parameters and other optimization settings through the custatevecExConfigureSVUpdater API for specific performance requirements.
Compute resource management
cuStateVec Ex automatically manages the additional compute resources required to execute its APIs. Handling workspace allocation and other resource management internally simplifies simulator development.
Samples
Several sample programs are provided to demonstrate key features and usage patterns of the cuStateVec Ex APIs. These sample files are available in the NVIDIA cuQuantum GitHub repository:
- quantum_state_initialization.cpp
Demonstrates how to initialize a state vector with external user data. The workflow shows setting quantum state from user data, reassigning wire ordering to match the imported data layout, then permuting the wire ordering to revert it to the default, which resets the memory layout of the state vector.
- index_bit_permutation.cpp
Illustrates wire management and index bit permutation capabilities. Creates random permutations and applies them to the state vector, then verifies that state vector elements are correctly rearranged to match the updated wire ordering.
- pauli_functions.cpp
Shows usage of Pauli rotation and Pauli expectation value computation. Applies rotation operations with custatevecExApplyPauliRotation and computes expectation values using custatevecExComputeExpectationOnPauliBasis for multiple Pauli strings.
- estimate_pi.cpp
Implements a quantum phase estimation algorithm to estimate the value of π. This complete quantum algorithm example demonstrates advanced gate operations and measurement techniques using cuStateVec Ex APIs, following the approach described in the Qiskit textbook.
- noise_channel.cpp
Demonstrates the SVUpdater workflow for noise simulation. Shows how to queue operators once and apply them repeatedly with different random numbers to study noise effects on GHZ entangled states across different system sizes.