Release Notes

cuQuantum Python v25.06.0

Add new APIs and functionalities:
- For low-level APIs, please refer to the release notes of cuStateVec v1.9.0, cuTensorNet v2.8.0 and cuDensityMat v0.2.0
- Added new options parameter to cuquantum.tensornet.CircuitToEinsum to allow users to control the einsum expression generation and tensor operand representation of qiskit.QuantumCircuit and cirq.Circuit objects. Users can now specify the following options:
  - cuquantum.tensornet.QiskitParserOptions for qiskit.QuantumCircuit
  - cuquantum.tensornet.CirqParserOptions for cirq.Circuit
  For more details, please refer to the documentation of cuquantum.tensornet.QiskitParserOptions and cuquantum.tensornet.CirqParserOptions.
- Added vector-Jacobian product (VJP) computation (backward differentiation) of the action of a cuquantum.densitymat.Operator with respect to the parameters of its dynamic components. This is only enabled for single-GPU execution and for cuquantum.densitymat.Operator instances composed of dense elementary or matrix operators without any batched components.
- Support cuquantum.densitymat.MultidiagonalOperator with arbitrary mode extent and up to 256 non-zero diagonals.
Deprecation Status:
- The deprecated APIs mentioned in v25.03.0 (including cuquantum.custatevec, cuquantum.cutensornet, and various high-level APIs) remain functional in this release but will be removed in the next release. Users are strongly encouraged to migrate to the new APIs in cuquantum.tensornet and cuquantum.bindings as soon as possible.

Compatibility notes:

Following NEP-29, cuQuantum Python now requires Python 3.11+. To upgrade from an earlier version, you can download Python from python.org or use an environment manager such as conda (recommended) or pyenv.
cuQuantum Python now supports Cython v3.0+.
cuQuantum Python now supports Qiskit v1.4.2+, including compatibility with the Qiskit v2.0 releases.
In the next release, NumPy v1.26+ will be required to follow NEP-29.

Bugs fixed:
- Fixed a known issue with in-place addition of a cuquantum.densitymat.OperatorTerm to a cuquantum.densitymat.Operator via the += operator overload.
- Fixed a bug that ignored scalar values supplied for the static component of batched coefficients in cuquantum.densitymat.
Known issues:
- The VJP computation fails with a runtime error for any cuquantum.densitymat.Operator that contains at least one matrix operator defined with a regular callback but without a gradient callback.

cuQuantum Python v25.03.0

Deprecation Changes:

All low-level APIs for cuStateVec and cuTensorNet are now migrated to cuquantum.bindings under corresponding submodules cuquantum.bindings.custatevec and cuquantum.bindings.cutensornet while the API names and signatures remain unchanged. The original cuquantum.custatevec and cuquantum.cutensornet modules are now deprecated and will be removed in a future release.

A new module cuquantum.tensornet is now released to host all high-level APIs for tensor network computations including the following:

| Deprecated | New |
| --- | --- |
| `cuquantum.contract` | `cuquantum.tensornet.contract` |
| `cuquantum.contract_path` | `cuquantum.tensornet.contract_path` |
| `cuquantum.einsum` | `cuquantum.tensornet.einsum` |
| `cuquantum.einsum_path` | `cuquantum.tensornet.einsum_path` |
| `cuquantum.Network` | `cuquantum.tensornet.Network` |
| `cuquantum.NetworkOptions` | `cuquantum.tensornet.NetworkOptions` |
| `cuquantum.CircuitToEinsum` | `cuquantum.tensornet.CircuitToEinsum` |
| `cuquantum.OptimizerInfo` | `cuquantum.tensornet.OptimizerInfo` |
| `cuquantum.OptimizerOptions` | `cuquantum.tensornet.OptimizerOptions` |
| `cuquantum.PathFinderOptions` | `cuquantum.tensornet.PathFinderOptions` |
| `cuquantum.ReconfigOptions` | `cuquantum.tensornet.ReconfigOptions` |
| `cuquantum.SlicerOptions` | `cuquantum.tensornet.SlicerOptions` |
| `cuquantum.cutensornet.tensor` | `cuquantum.tensornet.tensor` |
| `cuquantum.cutensornet.experimental` | `cuquantum.tensornet.experimental` |

These high-level pythonic APIs are still accessible directly under cuquantum but the support will be dropped in a future release. Users are encouraged to start the migration to the new APIs as soon as possible.
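As a rough aid for auditing existing code, the module moves described above can be expressed as a small path-translation function. This is only a sketch: `new_api_path` is a hypothetical helper and not part of cuQuantum Python. It encodes just the relocations listed in this section and does not import cuquantum.

```python
# Deprecated top-level names that now live under cuquantum.tensornet,
# copied from the table above.
_MOVED_TO_TENSORNET = (
    "contract", "contract_path", "einsum", "einsum_path",
    "Network", "NetworkOptions", "CircuitToEinsum", "OptimizerInfo",
    "OptimizerOptions", "PathFinderOptions", "ReconfigOptions", "SlicerOptions",
)


def new_api_path(old_path: str) -> str:
    """Map a deprecated dotted path to its v25.03.0 location (hypothetical helper)."""
    parts = old_path.split(".")
    # cuquantum.cutensornet.tensor / .experimental -> cuquantum.tensornet.*
    if (parts[:2] == ["cuquantum", "cutensornet"]
            and len(parts) > 2
            and parts[2] in ("tensor", "experimental")):
        return ".".join(["cuquantum", "tensornet"] + parts[2:])
    # Low-level modules -> cuquantum.bindings.<module>
    if parts[:2] in (["cuquantum", "custatevec"], ["cuquantum", "cutensornet"]):
        return ".".join(["cuquantum", "bindings"] + parts[1:])
    # High-level names directly under cuquantum -> cuquantum.tensornet.<name>
    if len(parts) >= 2 and parts[0] == "cuquantum" and parts[1] in _MOVED_TO_TENSORNET:
        return ".".join(["cuquantum", "tensornet"] + parts[1:])
    # Anything else is left untouched.
    return old_path


for old in ("cuquantum.contract",
            "cuquantum.cutensornet.tensor.SVDMethod",
            "cuquantum.custatevec"):
    print(f"{old} -> {new_api_path(old)}")
```

The checks run from most specific to least specific so that `cuquantum.cutensornet.tensor` maps to `cuquantum.tensornet.tensor` rather than to the low-level bindings module.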

cuquantum.densitymat module requires callback functions to be passed in form of wrapper classes GPUCallback or CPUCallback instead of directly as a callable.
cuquantum.densitymat callback functions now receive callback arguments as a two-dimensional numpy.ndarray or cupy.ndarray instead of as a Tuple.
cuquantum.bindings.cudensitymat requires callback functions to be passed via wrapper classes WrappedScalarCallback and WrappedTensorCallback available in this submodule.

Add new APIs and functionalities:
- For low-level APIs, please refer to the release notes of cuStateVec v1.8.0, cuTensorNet v2.7.0, and cuDensityMat v0.1.0
- Added a new option cuquantum.tensornet.experimental.MPSConfig.gauge_option to allow users to perform MPS simulation using either the free gauge form or the simple update algorithm.
- Added a new method cuquantum.tensornet.experimental.NetworkState.apply_general_tensor_channel() to allow users to perform noisy simulation of a tensor network state with general tensor channels.
- Added full support for batching of operators and coefficients to cuquantum.densitymat.
- Added support for dense matrix operators defined on the full Hilbert space in cuquantum.densitymat.
- Added support for general single-mode multidiagonal operators to cuquantum.densitymat.
- Added support for GPU-based callbacks, as well as in-place and out-of-place for both CPU and GPU-based callbacks in cuquantum.densitymat.
- cuquantum.densitymat.DenseOperator and cuquantum.densitymat.MultidiagonalOperator now allow copy-less instance creation. The default behaviour is to copy the input array.
Bugs fixed:
- Fixed a bug that caused cuquantum.tensornet.experimental.NetworkState to fail when the state contains only cuquantum.tensornet.experimental.NetworkOperators and no tensor operators.
- Fixed a bug in cuquantum.tensornet.experimental.NetworkState.compute_state_vector() and cuquantum.tensornet.experimental.NetworkState.compute_batched_amplitudes() that caused failures when one was called immediately after the other, with both intended to compute the full state vector.
Known issues:
- In-place addition of a cuquantum.densitymat.OperatorTerm to a cuquantum.densitymat.Operator via the += operator overload is broken. For the same functionality use cuquantum.densitymat.Operator.append() instead.

cuQuantum Python v24.11.0

Add new APIs and functionalities:
- A new module cuquantum.bindings to host the low-level APIs for the cuDensityMat library under cuquantum.bindings.cudensitymat.
- For low-level APIs, please refer to the release notes of cuStateVec v1.7.0, cuTensorNet v2.6.0 and cuDensityMat v0.0.5
- A new experimental module cuquantum.densitymat for accelerating analog quantum dynamics solvers based on the quantum many-body operators and density-matrix (or state-vector) formalism. For a detailed introduction to this new module, please refer to Quantum Dynamics APIs. Note: All new experimental APIs may be subject to change in a future release. Kindly share your feedback with us on NVIDIA/cuQuantum GitHub Discussions!
- Added a new parameter release_operators in cuquantum.cutensornet.experimental.NetworkState.compute_output_state() to allow users to choose whether to release the reference of all underlying operators during MPS simulation.
- Added a new method cuquantum.cutensornet.experimental.NetworkState.apply_unitary_tensor_channel() to allow users to perform noisy simulation of a tensor network state. See the Python sample (example05_noisy_unitary_channels.py) for details.
- Added a new class cuquantum.MemoryLimitExceeded to raise an exception when an operation requires more device memory than specified in the operation options. This applies to contraction, decomposition, and contraction-decomposition operations.
Bugs fixed:
- Fix a bug for cuquantum.CircuitToEinsum to support qiskit.QuantumCircuit for qiskit<v1.0.0.
Other changes:
- The compute_norm method has been removed from the experimental API cuquantum.cutensornet.experimental.NetworkState. The squared norm of the state can now be retrieved by providing the argument return_norm=True to the following APIs:
  - cuquantum.cutensornet.experimental.NetworkState.compute_amplitude()
  - cuquantum.cutensornet.experimental.NetworkState.compute_batched_amplitudes()
  - cuquantum.cutensornet.experimental.NetworkState.compute_state_vector()
  - cuquantum.cutensornet.experimental.NetworkState.compute_expectation()
Planned Changes (Upcoming in Next Release):
- All low-level APIs for cuStateVec and cuTensorNet will be migrated to cuquantum.bindings under corresponding submodules cuquantum.bindings.custatevec and cuquantum.bindings.cutensornet while the API names and signatures will remain unchanged.
- A new module cuquantum.tensornet will be released to host all high-level APIs for tensor network computations including the following:
  - cuquantum.contract()
  - cuquantum.contract_path()
  - cuquantum.einsum()
  - cuquantum.einsum_path()
  - cuquantum.Network
  - cuquantum.NetworkOptions
  - cuquantum.CircuitToEinsum
  - cuquantum.OptimizerInfo
  - cuquantum.OptimizerOptions
  - cuquantum.PathFinderOptions
  - cuquantum.ReconfigOptions
  - cuquantum.SlicerOptions
  - cuquantum.cutensornet.tensor
  - cuquantum.cutensornet.experimental
Note: The planned changes outlined above are subject to change and may differ from the final implementation at the time of release. A detailed migration guide will be provided in the upcoming release. We welcome your feedback on these planned changes on NVIDIA/cuQuantum GitHub Discussions. The existing APIs will be deprecated but will remain supported for a few releases to ensure a smooth transition before being removed.

cuQuantum Python v24.08.0

Add new APIs and functionalities:
- For low-level APIs, please refer to the release notes of cuTensorNet v2.5.0.
- Introduction of experimental APIs under cuquantum.cutensornet.experimental, which includes:
  - cuquantum.cutensornet.experimental.NetworkState, which enables direct usage of cuTensorNet state APIs to simulate pure tensor network states using various methods. Users can initiate simulations either from a fully parameterized circuit, such as qiskit.QuantumCircuit or cirq.Circuit, or gradually build the tensor network state by applying tensor operators to the cuquantum.cutensornet.experimental.NetworkState object.
  - cuquantum.cutensornet.experimental.TNConfig, which supports contraction-based tensor network simulation for the cuquantum.cutensornet.experimental.NetworkState when provided via the config parameter.
  - cuquantum.cutensornet.experimental.MPSConfig, which supports matrix product states (MPS) based simulation for the cuquantum.cutensornet.experimental.NetworkState when provided via the config parameter.
  - cuquantum.cutensornet.experimental.NetworkOperator, which allows the construction of custom network operators by adding either tensor network operator products or matrix product operators (MPO). The network operator object can then interact with a network state object to apply MPOs or compute the expectation value of the network operator.
  For a detailed introduction to these new APIs, please refer to tensor network simulator or the NetworkState examples directory.
  
  Note: All new experimental APIs may be subject to change in a future release. Kindly share your feedback with us on NVIDIA/cuQuantum GitHub Discussions!
Bugs fixed:
- Fix a bug for cuquantum.NetworkOptions.compute_type to support cuquantum.cutensornet.ComputeType.COMPUTE_3XTF32.

Known issues:

In PyPI, our packages were released with an additional post-release tag in the version:
```
A.B.C.D
~~~~~~↑
```
To let pip select the correct post-release tag (and corresponding version), use this command:
```
pip install cuquantum-python~=A.B.C
pip install cuquantum~=A.B.C
```
For example, any of the following commands will correctly resolve the additional version component:
```
pip install cuquantum-python==24.8.0.2
pip install cuquantum-python~=24.8.0
pip install cuquantum==24.8.0.2
pip install cuquantum~=24.8.0
```
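To see why `~=A.B.C` picks up the extra version component, recall that PEP 440 defines `~=24.8.0` as equivalent to `>=24.8.0, ==24.8.*`. The following is a minimal sketch of that rule for purely numeric versions; `matches_compatible_release` is a hypothetical helper written for illustration, not pip's implementation, and it does not parse suffixes such as `.post1`.

```python
def matches_compatible_release(version: str, spec: str) -> bool:
    """Check a numeric version against the specifier `~=spec` (simplified sketch)."""
    v = [int(part) for part in version.split(".")]
    s = [int(part) for part in spec.split(".")]
    if len(s) < 2:
        raise ValueError("~= needs at least two version components")
    # `~=A.B.C` pins every component except the last one (==A.B.*) ...
    prefix = s[:-1]
    # ... and requires the candidate to be at least A.B.C (>=A.B.C).
    width = max(len(v), len(s))
    v_padded = v + [0] * (width - len(v))
    s_padded = s + [0] * (width - len(s))
    return v[:len(prefix)] == prefix and v_padded >= s_padded


print(matches_compatible_release("24.8.0.2", "24.8.0"))  # the post-release build is accepted
print(matches_compatible_release("24.11.0", "24.8.0"))   # a different release is rejected
```

Because the fourth component only has to satisfy the lower bound, any post-release build of the same `A.B.C` is eligible, and pip then selects the highest one available.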

Compatibility notes:

cuQuantum Python now requires Python 3.10+.
cuQuantum Python now supports NumPy 2.0.

cuQuantum Python v24.03.0

Add new APIs and functionalities:

For low-level APIs, please refer to the release notes of cuStateVec v1.6.0 and cuTensorNet v2.4.0.

Starting this release, we offer experimental support for C symbol exposure to Cython users, through the new, cimport-able cycustatevec and cycutensornet modules.

Several APIs and objects are renamed for better consistency:

| Old | New |
| --- | --- |
| `collapse_by_bitstring()` | `collapse_by_bit_string()` |
| `collapse_by_bitstring_batched_get_workspace_size()` | `collapse_by_bit_string_batched_get_workspace_size()` |
| `collapse_by_bitstring_batched()` | `collapse_by_bit_string_batched()` |
| `Collapse` | `CollapseOp` |
| `contraction_optimizer_config_get_attribute_dtype()` | `get_contraction_optimizer_config_attribute_dtype()` |
| `contraction_optimizer_info_get_attribute_dtype()` | `get_contraction_optimizer_info_attribute_dtype()` |
| `contraction_autotune_preference_get_attribute_dtype()` | `get_contraction_autotune_preference_attribute_dtype()` |
| `tensor_svd_config_get_attribute_dtype()` | `get_tensor_svd_config_attribute_dtype()` |
| `tensor_svd_info_get_attribute_dtype()` | `get_tensor_svd_info_attribute_dtype()` |
| `network_get_attribute_dtype()` | `get_network_attribute_dtype()` |
| `marginal_get_attribute_dtype()` | `get_marginal_attribute_dtype()` |
| `sampler_get_attribute_dtype()` | `get_sampler_attribute_dtype()` |
| `state_get_attribute_dtype()` | `get_state_attribute_dtype()` |
| `accessor_get_attribute_dtype()` | `get_accessor_attribute_dtype()` |
| `expectation_get_attribute_dtype()` | `get_expectation_attribute_dtype()` |

The old APIs still exist and are functional, but they are considered deprecated and subject to removal in a future release.
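Most of the renames above follow one pattern: `<object>_get_attribute_dtype()` becomes `get_<object>_attribute_dtype()`. The sketch below captures that pattern with a regular expression and lists the remaining renames explicitly. `renamed` is a hypothetical helper for illustration only; it works on name strings, does not touch the cuQuantum modules, and would also rewrite names with the same shape that are not in the table.

```python
import re

# Renames that do not follow the *_get_attribute_dtype pattern (from the table above).
_EXPLICIT_RENAMES = {
    "collapse_by_bitstring": "collapse_by_bit_string",
    "collapse_by_bitstring_batched_get_workspace_size":
        "collapse_by_bit_string_batched_get_workspace_size",
    "collapse_by_bitstring_batched": "collapse_by_bit_string_batched",
    "Collapse": "CollapseOp",
}

# <object>_get_attribute_dtype -> get_<object>_attribute_dtype
_DTYPE_GETTER = re.compile(r"^(?P<obj>[a-z_]+)_get_attribute_dtype$")


def renamed(old_name: str) -> str:
    """Return the new name for a deprecated API name (hypothetical helper)."""
    name = old_name.rstrip("()")  # tolerate a trailing "()" as written in the table
    if name in _EXPLICIT_RENAMES:
        return _EXPLICIT_RENAMES[name]
    match = _DTYPE_GETTER.match(name)
    if match:
        return f"get_{match.group('obj')}_attribute_dtype"
    return name  # not affected by this release's renames


print(renamed("tensor_svd_config_get_attribute_dtype()"))
print(renamed("Collapse"))
```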

Added options for resource management (see Resource management).
- The execution methods of Network, including autotune(), contract(), and gradients(), now accept an optional release_workspace argument, to request that the workspace used for execution be released on function return.
- The reset_operands() method now accepts passing None for the operands argument to request that the internal reference to the network operands be released.

Bugs fixed:
- Fix certain failure for cuquantum.CircuitToEinsum to parse qiskit.QuantumCircuit with complex standard gates.

Known issues:

In PyPI, our packages were released with an additional post-release tag in the version:
```
A.B.C.D
~~~~~~↑
```
To let pip select the correct post-release tag (and corresponding version), use this command:
```
pip install cuquantum-python~=A.B.C
pip install cuquantum~=A.B.C
```
For cuQuantum SDK 24.3.0, all (meta) packages were provisioned a post-release tag: 24.3.0.post1. To illustrate, any of the following commands will correctly resolve the additional version component:
```
pip install cuquantum-python==24.3.0.post1
pip install cuquantum-python~=24.3.0
pip install cuquantum==24.3.0.post1
pip install cuquantum~=24.3.0
```

Compatibility notes:

cuquantum.CircuitToEinsum now supports qiskit <= v1.0.0.
cuQuantum Python now requires CuPy v13+.
cuQuantum Python now supports Python 3.12.
- In the next release, Python 3.9 will be dropped to follow NEP-29. (This refers to the pre-built wheels on PyPI.org and the Conda packages on conda-forge. If you have any needs for pre-built support, please reach out to us on GitHub. Alternatively, you may build from source, although we might not guarantee indefinite support for source compatibility.)
cuQuantum will drop support for RHEL 7 in the following cuQuantum release. Please plan ahead with this in mind. Thank you.

cuQuantum Python v23.10.0

Add new APIs and functionalities:
- For low-level APIs, please refer to the release notes of cuStateVec v1.5.0 and cuTensorNet v2.3.0.
- The function cuquantum.contract() now works like a native PyTorch operator as far as autograd is concerned, if the input operands are PyTorch tensors. This is an experimental feature.
- A new, experimental method cuquantum.Network.gradients() is added for computing the gradients of the network with respect to the input operands.
  - If the gradients are complex-valued, the convention follows PyTorch’s.
- Added a new attribute cuquantum.cutensornet.tensor.SVDMethod.discarded_weight_cutoff to allow SVD truncation based on discarded weight.
- The cuquantum.Network constructor and its reset_operands() method now accept an optional stream argument.
Bugs fixed:
- Fix potential data corruption when reset_operands() is called when the provided operands don’t outlive the contraction operation.
- When CPU arrays (from NumPy/PyTorch) were used as input operands for contraction, the internal streams were not properly ordered.
- The methods autotune(), contract() and the standalone function contract() allow passing the pointer address for the stream argument, as promised in the docs.
- The attribute dtypes for cuquantum.cutensornet.MarginalAttribute.OPT_NUM_HYPER_SAMPLES and cuquantum.cutensornet.SamplerAttribute.OPT_NUM_HYPER_SAMPLES are fixed.
Other changes:
- If Python logging is enabled, cuTensorNet’s run-time (instead of build-time) version is reported.
- For passing PyTorch tensors to contraction APIs, the tensor flags .is_conj() and .requires_grad are now taken into account, unless a user explicitly overwrites them with the qualifiers argument.

cuQuantum Python v23.06.0

Add new APIs and functionalities:
- For low-level APIs, please refer to the release notes of cuStateVec v1.4.0 and cuTensorNet v2.2.0.
  - Complex-valued gradients, as returned by the experimental API cuquantum.cutensornet.compute_gradients_backward(), differ by a complex conjugate from PyTorch’s convention.
- New attribute cuquantum.cutensornet.tensor.SVDMethod.algorithm that allows users to choose between various SVD algorithms including "gesvd", "gesvdj", "gesvdr" and "gesvdp". For "gesvdj" and "gesvdr", users may also provide customized settings (e.g., the tolerance for the "gesvdj" algorithm).
- New attribute cuquantum.cutensornet.tensor.SVDInfo.algorithm that describes the SVD algorithm used in the SVD computation. For "gesvdj" and "gesvdp", users may also access execution information (e.g., the residual for the "gesvdj" algorithm).
Bugs fixed:
- Fix a bug for the auto blocking option for cuquantum.cutensornet.tensor.decompose() and cuquantum.cutensornet.experimental.contract_decompose().
- Fix a bug for cuquantum.CircuitToEinsum to account for the potential global phase when parsing qiskit.QuantumCircuit.
- Fix a bug for cuquantum.CircuitToEinsum to parse custom gates given qiskit.QuantumCircuit.
Other changes:
- Improved the Jupyter notebook for MPS demo by including the density-matrix based MPS-MPO contraction algorithm.
- Avoid using any whitespace unicode characters as TN symbols in cuquantum.CircuitToEinsum.
- The path finding algorithm cuquantum.PathFinderOptions now takes advantage of the new smart option to limit the pathfinder elapsed time (see CUTENSORNET_CONTRACTION_OPTIMIZER_CONFIG_SMART_OPTION for details). This change also applies to public APIs including cuquantum.contract(), cuquantum.contract_path() and cuquantum.OptimizerOptions.
- When running the hyper-optimizer to compute the optimal contraction path with cuquantum.contract_path() & co, which could be long-running depending on the problem size, it is now possible to abort via Ctrl-C.
- Conda packages compatible with CUDA 12 are available on conda-forge. Users can specify the target CUDA version using the new cuda-version metapackage if needed. For example, conda install -c conda-forge cuquantum-python cuda-version=12.0 or conda install -c conda-forge cuquantum-python cuda-version=11.8. This support has been backported to cuQuantum Python 23.03.
- Improve pip dependency management: When installing cuquantum-python or cuquantum-python-cu12 via pip, we will attempt to infer the compatible CuPy wheel and install it (with caveats noted below in the 23.03 release). The exception is cuquantum-python-cu11, for which we require users to explicitly pick from cupy-cuda110, cupy-cuda111, or cupy-cuda11x and install it, following CuPy’s installation guide.
  - If installing the meta-package cuquantum-python with pip 23.1+, passing --no-cache-dir to pip is required.

Compatibility notes:

cuQuantum Python now requires Python 3.9+.
cuQuantum Python now requires NumPy v1.21+.
cuQuantum Python now requires CuPy v10+.

Known issues:

Under single precision, when the input tensor/matrix has a low rank, "gesvdr" based tensor SVD may suffer from reduced accuracy.
When the "gesvdp" algorithm is used for tensor SVD, the user is responsible for checking cuquantum.cutensornet.tensor.SVDInfo.gesvdp_err_sigma to monitor the convergence.

cuQuantum Python v23.03.0

Add new APIs and functionalities:
- For low-level APIs, please refer to the release notes of cuStateVec v1.3.0 and cuTensorNet v2.1.0.
- A new module cuquantum.cutensornet.tensor that supports tensor decomposition routines via cuquantum.cutensornet.tensor.decompose() and cuquantum.cutensornet.tensor.DecompositionOptions. The new module is also directly accessible via the cuquantum namespace. The following decomposition methods are supported:
  - QR decomposition via cuquantum.cutensornet.tensor.QRMethod.
  - Exact and approximate singular value decomposition (SVD) via cuquantum.cutensornet.tensor.SVDMethod. For approximate SVD, run-time information on truncation is stored in and accessible via cuquantum.cutensornet.tensor.SVDInfo.
- A new module cuquantum.cutensornet.experimental with experimental APIs, including cuquantum.cutensornet.experimental.contract_decompose(), cuquantum.cutensornet.experimental.ContractDecomposeAlgorithm and cuquantum.cutensornet.experimental.ContractDecomposeInfo. These experimental APIs can be used to perform compound contraction and decomposition operations. Note that all new experimental APIs may be subject to change in a future release. Kindly share your feedback with us on NVIDIA/cuQuantum GitHub Discussions!
- A new attribute cuquantum.CircuitToEinsum.gates is added to allow users to access gate operands from cuquantum.CircuitToEinsum.
API changes:
- The fixed kwarg support in cuquantum.CircuitToEinsum.state_vector() is removed. The same functionality can be achieved via the same fixed kwarg in cuquantum.CircuitToEinsum.batched_amplitudes().
Bugs fixed:
- The output mode labels were not lexicographically ordered when the Einstein summation expression is provided in implicit form (this is a regression from cuQuantum Python v0.1.0.1).
- Fix the parallel contraction failure when using MPICH (NVIDIA/cuQuantum#31).
Other changes:
- cuQuantum Python now supports Python 3.11.
- cuQuantum Python now supports CUDA 12.
- A set of new wheels with suffix -cu12 are released on PyPI.org for CUDA 12 users.
  - Example: pip install cuquantum-python-cu12 cupy-cuda12x for setting up a wheel-based environment compatible with CUDA 12
  - The existing cuquantum and cuquantum-python wheels (without the -cuXX suffix) are turned into automated installers that will attempt to detect the current CUDA environment and install the appropriate wheels. Please note that this automated detection may encounter conditions under which detection is unsuccessful, especially in a CPU-only environment (such as CI/CD). If detection fails we assume that the target environment is CUDA 11 and proceed. This assumption may be changed in a future release, and in such cases we recommend that users explicitly (manually) install the correct wheels.
- For conda packages, currently CUDA 12 support is pending the NVIDIA-led community effort (conda-forge/staged-recipes#21382). Once conda-forge supports CUDA 12 we will make compatible conda packages available.
- CUDA Lazy Loading is supported. This can significantly reduce memory footprint by deferring the loading of needed GPU kernels to the first call sites. This feature requires CUDA 11.8 (or above) and cuTENSOR 1.7.0 (or above). Please refer to the CUDA documentation for other requirements and details. Currently this feature requires users to opt in by setting the environment variable CUDA_MODULE_LOADING=LAZY. In a future CUDA version, lazy loading may become the default.
  - If you’re a wheel user, update your environment with pip install "cutensor-cuXX>=1.7" (XX = 11 or 12).
  - If you’re a conda user, update your environment with conda install -c conda-forge "cudatoolkit>=11.8" "cutensor>=1.7" (for CUDA 11).
- Our support policy is clarified, see Compatibility policy.
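The CUDA Lazy Loading opt-in mentioned above can also be done from Python instead of the shell. The sketch below assumes the variable has to be in place before CUDA is initialized in the process, so it is set before any GPU library is imported. `enable_cuda_lazy_loading` is a hypothetical helper name, not a cuQuantum API.

```python
import os


def enable_cuda_lazy_loading(env=os.environ):
    """Opt in to CUDA Lazy Loading unless a loading mode was already chosen.

    Returns the value that ends up in effect for CUDA_MODULE_LOADING.
    """
    # setdefault keeps an explicit user setting (for example EAGER) intact.
    return env.setdefault("CUDA_MODULE_LOADING", "LAZY")


# Call this first, then import cupy / cuquantum afterwards.
mode = enable_cuda_lazy_loading()
print(f"CUDA_MODULE_LOADING={mode}")
```

Using `setdefault` rather than plain assignment means a value exported in the shell still takes precedence over the script.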

Compatibility notes:

cuQuantum Python requires Python 3.8+
- In the next release, Python 3.8 will be dropped to follow NEP-29. (This refers to the pre-built wheels on PyPI.org and the Conda packages on conda-forge. If you have any needs for pre-built support, please reach out to us on GitHub. Alternatively, you may build from source, although we might not guarantee indefinite support for source compatibility.)
cuQuantum Python requires NumPy v1.19+
- In the next release, NumPy 1.19 & 1.20 will be dropped to follow NEP-29.
cuQuantum Python requires CuPy v9.5+
- In the next release, CuPy v9 will be dropped to be consistent with NEP-29.

cuQuantum Python v22.11.0.1

This is a hot-fix release addressing a few issues in cuQuantum Python.

Bugs fixed:
- Fix performance degradation in cuquantum.contract() that could impact certain usage patterns.
- Fix the .save_statevector() usage in the Jupyter notebook qiskit_basic.ipynb.
- Remove invalid code.

cuQuantum Python v22.11.0

We are on NVIDIA/cuQuantum GitHub Discussions! For any questions regarding (or to share any exciting work built upon) cuQuantum, please feel free to reach out to us on GitHub Discussions.
- Bug reports should still go to our GitHub issue tracker.
Add new APIs and functionalities:
- For cuTensorNet low-level APIs, please refer to the release notes of cuTensorNet v2.0.0; no new cuStateVec API is added.
- A new API, cuquantum.CircuitToEinsum.batched_amplitudes() to compute the amplitudes for a batch of qubits. This is equivalent to the kwargs fixed support in cuquantum.CircuitToEinsum.state_vector(), which is deprecated and will be removed in a future release.
- A new API, cuquantum.CircuitToEinsum.expectation() to support expectation value computation for Pauli strings.
- A new helper API, cuquantum.cutensornet.get_mpi_comm_pointer() to get the pointer to and size of MPI communicator for the new low-level API cuquantum.cutensornet.distributed_reset_configuration() that enables distributed parallelism. This capability requires mpi4py.
- The cuquantum module now has a command line interface to return the include and library paths and the linker flags for the cuTENSOR and cuQuantum libraries: python -m cuquantum. See Command line support for detail.
- Support for controlling non-blocking behavior via the new option cuquantum.NetworkOptions.blocking.
API changes:
- For cuTensorNet low-level APIs, please refer to the release notes of cuTensorNet v2.0.0; cuStateVec low-level APIs remain unchanged.
  - Users can set the tensor qualifiers by using the dedicated NumPy dtype cuquantum.cutensornet.tensor_qualifiers_dtype. See the Python sample (python/samples/cutensornet/coarse/example21.py) for details. For example, complex conjugation can be done on-the-fly, reducing memory pressure.
  - To get/set the contraction path or slicing configurations, users should use contraction_optimizer_info_get_attribute_dtype() to get a NumPy custom dtype representing the path or slicing configuration object, in a manner consistent with the method used for all other attributes. Refer to the docstring for details. The (experimental) ContractionPath object is removed.
Functionality/performance improvements:
- Improved performance when contracting two tensors using cuquantum.contract() or related APIs.
- Improved performance for reusing a Network object with reset_operands().
- The lightcone construction in CircuitToEinsum is improved to further reduce the number of tensors in the network.
- The build system now supports PEP-517 and standard pip command-line flags. The environment variable CUQUANTUM_IGNORE_SOLVER is no longer used. See installation from source for more information.
Bugs fixed:
- Fix a potential multi-device bug in the internal device context switch.
- Fix a bug for using invalid mode labels in cuquantum.CircuitToEinsum when input circuit size gets large.
Other changes:
- Provide one more distributed (MPI+NCCL) Python sample (example4_mpi_nccl.py) to show how to use cuTensorNet and create parallelism.
- The test infrastructure will show tests that are not runnable as “deselected” instead of “skipped”.
- A new pip wheel is released on PyPI: pip install cuquantum-python-cu11. Users can still install cuQuantum Python via pip install cuquantum-python, as before. cuquantum-python now becomes a meta-wheel pointing to cuquantum-python-cu11. This may change in a future release when a new CUDA version becomes available. Using wheels with the -cuXX suffix is encouraged.

Compatibility notes:

cuQuantum Python now requires cuStateVec 1.1.0 or above.
cuQuantum Python now requires cuTensorNet 2.0.0 or above.
cuQuantum Python now requires cuTENSOR 1.6.1 or above.

cuQuantum Python v22.07.1

Bugs fixed:
- The 22.07.0 cuquantum wheel had a wrong file layout. (If you are using the cuquantum 22.07.0.1 or 22.07.0.2 hot-fix wheel, they will work fine.)

cuQuantum Python v22.07.0

Add new APIs and functionalities:
- For low-level APIs, please refer to the release notes of cuStateVec v1.1.0 and cuTensorNet v1.1.0.
- New high-level API cuquantum.CircuitToEinsum that supports conversion of qiskit.QuantumCircuit and cirq.Circuit to tensor network contraction:
  - Support state coefficient
  - Support bitstring amplitude
  - Support reduced density matrix
  - Backend support on NumPy, CuPy and PyTorch
- Add a keyword-only argument slices to the cuquantum.Network.contract() method to support contracting an arbitrary subset of the slices.
- Add a new attribute intermediate_modes to the cuquantum.OptimizerInfo object for retrieving the mode labels of all intermediate tensors.
- Add a new attribute num_slices to the cuquantum.OptimizerInfo object for querying the total number of slices.
Functionality/performance improvements:
- Improve the einsum expression parser.
Bugs fixed:
- An exception was mistakenly raised in cuquantum.einsum() when optimize is set to False.
- The f-specifier was missing in the string representation of cuquantum.OptimizerInfo.
Other changes:
- Drop the dependency on typing_extensions.
- Provide distributed (MPI-based) Python samples that show how easy it is to use cuTensorNet and create parallelism. mpi4py is required for running these samples.
- Update the low-level, non-distributed sample tensornet_example.py by improving memory usage and switching to the new contraction API contract_slices().
- Provide Jupyter notebooks to show how to convert a quantum circuit to a tensor network contraction.
- Add a Python sample to illustrate the usage of the new multi-device bit-swapping multi_device_swap_index_bits() API.
- Restructure the samples folder to separate cuStateVec and cuTensorNet samples.

Compatibility notes:

cuQuantum Python now requires cuQuantum v22.07.
cuQuantum Python now requires Python 3.8+.
cuQuantum Python now requires NumPy v1.19+.
cuQuantum Python supports Cirq v0.6.0+.
cuQuantum Python supports Qiskit v0.24.0+.

cuQuantum Python v22.05.0

Bugs fixed:
- Make typing_extensions a required dependency (NVIDIA/cuQuantum#3)
- Fix issues in the test suite
Other changes:
- The Python sample (python/samples/tensornet_example.py) is updated to include a correctness check

cuQuantum Python v22.03.0

Stable release:
- Starting this release, cuQuantum Python switches to the CalVer versioning scheme, following cuQuantum SDK
- pip wheels are released on PyPI: pip install cuquantum-python
Functionality/performance improvements:
- High-level tensor network APIs are now fully NumPy compliant:
  - Support generalized einsum expressions
  - Support ellipsis
  - Support broadcasting
Add new APIs and functionalities:
- For low-level APIs, please refer to the release notes of cuStateVec v1.0.0 and cuTensorNet v1.0.0.
- The high-level APIs support an EMM-like memory plugin interface (see External memory management).
API changes:
- For low-level APIs, please refer to the release notes of cuStateVec v1.0.0 and cuTensorNet v1.0.0.
- No API breaking changes for the high-level APIs.

Compatibility notes:

cuQuantum Python requires cuQuantum v22.03
cuQuantum Python requires Python 3.7+
- In the next release, Python 3.7 will be dropped to follow NEP-29.
cuQuantum Python requires NumPy v1.17+
- In the next release, NumPy 1.17 & 1.18 will be dropped to follow NEP-29.
cuQuantum Python requires CuPy v9.5+
cuQuantum Python supports PyTorch v1.10+

Known issues:

If you install cuQuantum Python from PyPI (pip install cuquantum-python), make sure you also install typing_extensions (via pip or conda). This only affects the wheel installation and will be fixed in the next release. (NVIDIA/cuQuantum#3)

cuQuantum Python v0.1.0.1

Patch release:
- Add a __version__ string

cuQuantum Python v0.1.0.0

Initial release (beta 2)

Compatibility notes:

cuQuantum Python requires cuQuantum v0.1.0
cuQuantum Python requires NumPy v1.17+
cuQuantum Python requires CuPy v9.5+
cuQuantum Python supports PyTorch v1.10+

Limitation notes:

In certain environments, if PyTorch is installed, import cuquantum could fail (with a segmentation fault). This is currently under investigation; a temporary workaround is to import torch before importing cuquantum.