Release Notes
cuQuantum Python v23.06.0
Add new APIs and functionalities:
For low-level APIs, please refer to the release notes of cuStateVec v1.4.0 and cuTensorNet v2.2.0.
Complex-valued gradients, as returned by the experimental API cuquantum.cutensornet.compute_gradients_backward(), differ by a complex conjugate from PyTorch’s convention.
New attribute cuquantum.cutensornet.tensor.SVDMethod.algorithm that allows users to choose between various SVD algorithms, including "gesvd", "gesvdj", "gesvdr" and "gesvdp". For "gesvdj" and "gesvdr", users may also provide customized settings (e.g., tolerance for the "gesvdj" algorithm).
New attribute cuquantum.cutensornet.tensor.SVDInfo.algorithm that describes the SVD algorithm used in the SVD computation. For "gesvdj" and "gesvdp", users may also access execution information (e.g., residual for the "gesvdj" algorithm).
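As a rough illustration of the conjugation note above, the sketch below converts a complex gradient from one convention to the other by conjugating every entry. It assumes the gradient has already been copied to host memory as nested Python lists; `conjugate_gradient` is a hypothetical helper and is not part of cuQuantum.

```python
def conjugate_gradient(grad):
    """Recursively conjugate every entry of a (nested) list of complex numbers."""
    if isinstance(grad, (list, tuple)):
        return [conjugate_gradient(g) for g in grad]
    return complex(grad).conjugate()


# Hypothetical gradient values standing in for the output of
# compute_gradients_backward(); they are made up for illustration.
cutn_grad = [[1 + 2j, 0.5 - 1j], [3j, 4.0]]
torch_style_grad = conjugate_gradient(cutn_grad)
```

Applying the helper twice returns the original values, so the same function converts in either direction.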
Bugs fixed:
Fix a bug for the auto blocking option for cuquantum.cutensornet.tensor.decompose() and cuquantum.cutensornet.experimental.contract_decompose().
Fix a bug for cuquantum.CircuitToEinsum to account for the potential global phase when parsing qiskit.QuantumCircuit.
Fix a bug for cuquantum.CircuitToEinsum to parse custom gates given qiskit.QuantumCircuit.
Other changes:
Improved the Jupyter notebook for MPS demo by including the density-matrix based MPS-MPO contraction algorithm.
Avoid using any whitespace Unicode characters as TN symbols in cuquantum.CircuitToEinsum.
The path finding algorithm cuquantum.PathFinderOptions now takes advantage of the new smart option to limit the pathfinder elapsed time (see CUTENSORNET_CONTRACTION_OPTIMIZER_CONFIG_SMART_OPTION for details). This change also applies to public APIs including cuquantum.contract(), cuquantum.contract_path() and cuquantum.OptimizerOptions.
When running the hyper-optimizer to compute the optimal contraction path with cuquantum.contract_path() and related APIs, which could be long-running depending on the problem size, it is now possible to abort via Ctrl-C.
Conda packages compatible with CUDA 12 are available on conda-forge. Users can specify the target CUDA version using the new cuda-version metapackage if needed. For example, conda install -c conda-forge cuquantum-python cuda-version=12.0 or conda install -c conda-forge cuquantum-python cuda-version=11.8. This support has been backported to cuQuantum Python 23.03.
Improve pip dependency management: When installing cuquantum-python or cuquantum-python-cu12 via pip, we will attempt to infer the compatible CuPy wheel and install it (with caveats noted below in the 23.03 release). The exception is cuquantum-python-cu11, for which we require users to explicitly pick from cupy-cuda110, cupy-cuda111, or cupy-cuda11x and install it, following CuPy’s installation guide.
If installing the meta-package cuquantum-python with pip 23.1+, passing --no-cache-dir to pip is required.
Compatibility notes:
cuQuantum Python now requires Python 3.9+
cuQuantum Python now requires NumPy v1.21+
cuQuantum Python now requires CuPy v10+
Known issues:
Under single precision, when the input tensor/matrix has a low rank, "gesvdr" based tensor SVD may suffer from reduced accuracy.
When the "gesvdp" algorithm is used for tensor SVD, the user is responsible for checking cuquantum.cutensornet.tensor.SVDInfo.gesvdp_err_sigma to monitor the convergence.
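The SVD algorithm selection added in this release, together with the "gesvdp" convergence check noted above, might be wired up roughly as follows. This is a sketch under assumptions: the keyword name `gesvdj_tol` and the `return_info=True` return shape are taken to match the cuTensorNet Python API, while `make_svd_kwargs`, `svd_with_convergence_check` and the `err_tol` threshold are hypothetical names introduced only for illustration.

```python
SUPPORTED_ALGORITHMS = ("gesvd", "gesvdj", "gesvdr", "gesvdp")


def make_svd_kwargs(algorithm="gesvd", **settings):
    """Validate the algorithm name and bundle it with algorithm-specific settings."""
    if algorithm not in SUPPORTED_ALGORITHMS:
        raise ValueError(f"unknown SVD algorithm: {algorithm!r}")
    return {"algorithm": algorithm, **settings}


def svd_with_convergence_check(expr, operand, svd_kwargs, err_tol=1e-6):
    """Run a tensor SVD and, for "gesvdp", inspect the reported error estimate."""
    # Imported lazily so that make_svd_kwargs() is usable without a GPU.
    from cuquantum.cutensornet.tensor import SVDMethod, decompose

    u, s, v, info = decompose(
        expr, operand, method=SVDMethod(**svd_kwargs), return_info=True
    )
    # err_tol is an arbitrary illustrative threshold, not a recommended value.
    if info.algorithm == "gesvdp" and info.gesvdp_err_sigma > err_tol:
        raise RuntimeError(
            f"gesvdp may not have converged: err_sigma={info.gesvdp_err_sigma}"
        )
    return u, s, v, info


# Assumed keyword name for the "gesvdj" tolerance setting.
jacobi_kwargs = make_svd_kwargs("gesvdj", gesvdj_tol=1e-12)
```

On a machine with cuQuantum Python installed, a call such as `svd_with_convergence_check("ij->ik,kj", a, make_svd_kwargs("gesvdp"))` would then raise if the reported error estimate exceeds the chosen threshold.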
cuQuantum Python v23.03.0
Add new APIs and functionalities:
For low-level APIs, please refer to the release notes of cuStateVec v1.3.0 and cuTensorNet v2.1.0.
A new module cuquantum.cutensornet.tensor that supports tensor decomposition routines via cuquantum.cutensornet.tensor.decompose() and cuquantum.cutensornet.tensor.DecompositionOptions. The new module is also directly accessible via the cuquantum namespace. The following decomposition methods are supported:
QR decomposition via cuquantum.cutensornet.tensor.QRMethod.
Exact and approximate singular value decomposition (SVD) via cuquantum.cutensornet.tensor.SVDMethod. For approximate SVD, run-time information on truncation is stored in and accessible via cuquantum.cutensornet.tensor.SVDInfo.
A new module cuquantum.cutensornet.experimental with experimental APIs, including cuquantum.cutensornet.experimental.contract_decompose(), cuquantum.cutensornet.experimental.ContractDecomposeAlgorithm and cuquantum.cutensornet.experimental.ContractDecomposeInfo. These experimental APIs can be used to perform compound contraction and decomposition operations. Note that all new experimental APIs may be subject to change in a future release. Kindly share your feedback with us on NVIDIA/cuQuantum GitHub Discussions!
A new attribute cuquantum.CircuitToEinsum.gates is added to allow users to access gate operands from cuquantum.CircuitToEinsum.
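To make the decomposition expression syntax more concrete, here is a minimal sketch of a QR decomposition written as "ij->ik,kj". The NumPy reference runs anywhere; `qr_with_cuquantum` shows how the same expression would presumably be passed to `decompose()` with `QRMethod`, and is only callable where cuQuantum Python and a GPU are available. Both function names are hypothetical.

```python
import numpy as np


def qr_reference(a):
    """NumPy reference for the expression "ij->ik,kj": a equals q @ r."""
    return np.linalg.qr(a)  # reduced mode: q is (i, k), r is (k, j)


def qr_with_cuquantum(a):
    """Same decomposition through the tensor module (needs cuQuantum and a GPU)."""
    # Imported lazily so the NumPy reference above runs without cuQuantum.
    from cuquantum.cutensornet.tensor import QRMethod, decompose

    return decompose("ij->ik,kj", a, method=QRMethod())


a = np.arange(12.0).reshape(4, 3)
q, r = qr_reference(a)
```

The shared mode `k` in the output labels plays the role of the bond between the two factors; contracting over it recovers the input tensor.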
API changes:
The fixed kwarg support in cuquantum.CircuitToEinsum.state_vector() is removed. The same functionality can be achieved via the same fixed kwarg in cuquantum.CircuitToEinsum.batched_amplitudes().
Bugs fixed:
The output mode labels were not lexicographically ordered when the Einstein summation expression was provided in implicit form (this was a regression from cuQuantum Python v0.1.0.1).
Fix the parallel contraction failure when using MPICH (NVIDIA/cuQuantum#31).
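For readers unfamiliar with the implicit form mentioned in the first bug fix above, the following sketch derives the output mode labels the way NumPy's einsum convention does: labels that appear exactly once are kept and sorted lexicographically. `implicit_output_modes` is a hypothetical helper for illustration only; it ignores ellipses and is not cuQuantum's parser.

```python
from collections import Counter


def implicit_output_modes(expr):
    """Return the output labels implied by an einsum expression without '->'."""
    if "->" in expr:
        raise ValueError("expression is already in explicit form")
    counts = Counter(ch for ch in expr if ch.isalpha())
    # Labels that appear exactly once survive, in lexicographic order.
    return "".join(sorted(label for label, n in counts.items() if n == 1))


# "ba,ac" contracts over 'a'; the remaining labels are ordered as "bc".
explicit = "ba,ac" + "->" + implicit_output_modes("ba,ac")
```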
Other changes:
cuQuantum Python now supports Python 3.11.
cuQuantum Python now supports CUDA 12.
A set of new wheels with suffix -cu12 is released on PyPI.org for CUDA 12 users.
Example: pip install cuquantum-python-cu12 cupy-cuda12x for setting up a wheel-based environment compatible with CUDA 12.
The existing cuquantum and cuquantum-python wheels (without the -cuXX suffix) are turned into automated installers that will attempt to detect the current CUDA environment and install the appropriate wheels. Please note that this automated detection may encounter conditions under which detection is unsuccessful, especially in a CPU-only environment (such as CI/CD). If detection fails we assume that the target environment is CUDA 11 and proceed. This assumption may be changed in a future release, and in such cases we recommend that users explicitly (manually) install the correct wheels.
For conda packages, currently CUDA 12 support is pending the NVIDIA-led community effort (conda-forge/staged-recipes#21382). Once conda-forge supports CUDA 12 we will make compatible conda packages available.
CUDA Lazy Loading is supported. This can significantly reduce memory footprint by deferring the loading of needed GPU kernels to the first call sites. This feature requires CUDA 11.8 (or above) and cuTENSOR 1.7.0 (or above). Please refer to the CUDA documentation for other requirements and details. Currently this feature requires users to opt in by setting the environment variable CUDA_MODULE_LOADING=LAZY. In a future CUDA version, lazy loading may become the default.
If you’re a wheel user, update your environment with pip install "cutensor-cuXX>=1.7" (XX = 11 or 12).
If you’re a conda user, update your environment with conda install -c conda-forge "cudatoolkit>=11.8" "cutensor>=1.7" (for CUDA 11).
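The opt-in for lazy loading described above can also be done from inside a Python script rather than the shell. The sketch below assumes the variable must be set before any CUDA library is loaded in the process, so it belongs at the very top of the program; `lazy_loading_requested` is a hypothetical helper that only inspects the environment and does not verify that the CUDA driver honors the setting.

```python
import os

# Set this before importing cupy, torch, cuquantum or any other CUDA-backed
# package, since the setting is read when CUDA is initialized.
os.environ["CUDA_MODULE_LOADING"] = "LAZY"


def lazy_loading_requested():
    """Report whether this process has opted in to CUDA lazy loading."""
    return os.environ.get("CUDA_MODULE_LOADING", "").upper() == "LAZY"
```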
Our support policy is clarified; see Compatibility policy.
Compatibility notes:
cuQuantum Python requires Python 3.8+
In the next release, Python 3.8 will be dropped to follow NEP-29. (This refers to the pre-built wheels on PyPI.org and the Conda packages on conda-forge. If you have any needs for pre-built support, please reach out to us on GitHub. Alternatively, you may build from source, although we might not guarantee indefinite support for source compatibility.)
cuQuantum Python requires NumPy v1.19+
In the next release, NumPy 1.19 & 1.20 will be dropped to follow NEP-29.
cuQuantum Python requires CuPy v9.5+
In the next release, CuPy v9 will be dropped to be consistent with NEP-29.
cuQuantum Python v22.11.0.1
This is a hot-fix release addressing a few issues in cuQuantum Python.
Bugs fixed:
Fix performance degradation in cuquantum.contract() that could impact certain usage patterns.
Fix the .save_statevector() usage in the Jupyter notebook qiskit_basic.ipynb.
Remove invalid code.
cuQuantum Python v22.11.0
We are on NVIDIA/cuQuantum GitHub Discussions! For any questions regarding (or to share any exciting work built upon) cuQuantum, please feel free to reach out to us on GitHub Discussions.
Bug reports should still go to our GitHub issue tracker.
Add new APIs and functionalities:
For cuTensorNet low-level APIs, please refer to the release notes of cuTensorNet v2.0.0; no new cuStateVec API is added.
A new API, cuquantum.CircuitToEinsum.batched_amplitudes(), to compute the amplitudes for a batch of qubits. This is equivalent to the fixed kwarg support in cuquantum.CircuitToEinsum.state_vector(), which is deprecated and will be removed in a future release.
A new API, cuquantum.CircuitToEinsum.expectation(), to support expectation value computation for Pauli strings.
A new helper API, cuquantum.cutensornet.get_mpi_comm_pointer(), to get the pointer to and size of the MPI communicator for the new low-level API cuquantum.cutensornet.distributed_reset_configuration() that enables distributed parallelism. This capability requires mpi4py.
The cuquantum module now has a command line interface to return the include and library paths and the linker flags for the cuTENSOR and cuQuantum libraries: python -m cuquantum. See Command Line Support for details.
Support for controlling non-blocking behavior via the new option cuquantum.NetworkOptions.blocking.
API changes:
For cuTensorNet low-level APIs, please refer to the release notes of cuTensorNet v2.0.0; cuStateVec low-level APIs remain unchanged.
Users can set the tensor qualifiers by using the dedicated NumPy dtype cuquantum.cutensornet.tensor_qualifiers_dtype. See the Python sample (python/samples/cutensornet/coarse/example21.py) for details. For example, complex conjugation can be done on-the-fly, reducing memory pressure.
To get/set the contraction path or slicing configurations, users should use contraction_optimizer_info_get_attribute_dtype() to get a NumPy custom dtype representing the path or slicing configuration object, in a manner consistent with the method used for all other attributes. Refer to the docstring for details. The (experimental) ContractionPath object is removed.
Functionality/performance improvements:
Improved performance when contracting two tensors using cuquantum.contract() or related APIs.
Improved performance for reusing a Network object with reset_operands().
The lightcone construction in CircuitToEinsum is improved to further reduce the number of tensors in the network.
The build system now supports PEP-517 and standard pip command-line flags. The environment variable CUQUANTUM_IGNORE_SOLVER is no longer used. See Getting Started for more information.
Bugs fixed:
Fix a potential multi-device bug in the internal device context switch.
Fix a bug for using invalid mode labels in cuquantum.CircuitToEinsum when the input circuit size gets large.
Other changes:
Provide one more distributed (MPI+NCCL) Python sample (example4_mpi_nccl.py) to show how to use cuTensorNet and create parallelism.
The test infrastructure will show tests that are not runnable as “deselected” instead of “skipped”.
A new pip wheel is released on PyPI: pip install cuquantum-python-cu11. Users can still install cuQuantum Python via pip install cuquantum-python, as before. cuquantum-python now becomes a meta-wheel pointing to cuquantum-python-cu11. This may change in a future release when a new CUDA version becomes available. Using wheels with the -cuXX suffix is encouraged.
Compatibility notes:
cuQuantum Python now requires cuStateVec 1.1.0 or above.
cuQuantum Python now requires cuTensorNet 2.0.0 or above.
cuQuantum Python now requires cuTENSOR 1.6.1 or above.
cuQuantum Python v22.07.1
Bugs fixed:
The 22.07.0 cuquantum wheel had a wrong file layout. (If you are using the cuquantum 22.07.0.1 or 22.07.0.2 hot-fix wheel, it will work fine.)
cuQuantum Python v22.07.0
Add new APIs and functionalities:
For low-level APIs, please refer to the release notes of cuStateVec v1.1.0 and cuTensorNet v1.1.0.
New high-level API cuquantum.CircuitToEinsum that supports conversion of qiskit.QuantumCircuit and cirq.Circuit to tensor network contraction:
Support state coefficient
Support bitstring amplitude
Support reduced density matrix
Backend support on NumPy, CuPy and PyTorch
Add a keyword-only argument slices to the cuquantum.Network.contract() method to support contracting an arbitrary subset of the slices.
Add a new attribute intermediate_modes to the cuquantum.OptimizerInfo object for retrieving the mode labels of all intermediate tensors.
Add a new attribute num_slices to the cuquantum.OptimizerInfo object for querying the total number of slices.
Functionality/performance improvements:
Improve the einsum expression parser.
Bugs fixed:
An exception was mistakenly raised in cuquantum.einsum() when optimize is set to False.
Missing f-specifier in the string representation of cuquantum.OptimizerInfo.
Other changes:
Drop the dependency on typing_extensions.
Provide distributed (MPI-based) Python samples that show how easy it is to use cuTensorNet and create parallelism. mpi4py is required for running these samples.
Update the low-level, non-distributed sample tensornet_example.py by improving memory usage and switching to the new contraction API contract_slices().
Provide Jupyter notebooks to show how to convert a quantum circuit to a tensor network contraction.
Add a Python sample to illustrate the usage of the new multi-device bit-swapping multi_device_swap_index_bits() API.
Restructure the samples folder to separate cuStateVec and cuTensorNet samples.
Compatibility notes:
cuQuantum Python now requires cuQuantum v22.07.
cuQuantum Python now requires Python 3.8+.
cuQuantum Python now requires NumPy v1.19+.
cuQuantum Python supports Cirq v0.6.0+.
cuQuantum Python supports Qiskit v0.24.0+.
cuQuantum Python v22.05.0
Bugs fixed:
Make typing_extensions a required dependency (NVIDIA/cuQuantum#3)
Fix issues in the test suite
Other changes:
The Python sample (python/samples/tensornet_example.py) is updated to include a correctness check
cuQuantum Python v22.03.0
Stable release:
Starting with this release, cuQuantum Python switches to the CalVer versioning scheme, following cuQuantum SDK
pip wheels are released on PyPI: pip install cuquantum-python
Functionality/performance improvements:
High-level tensor network APIs are now fully NumPy compliant:
Support generalized einsum expressions
Support ellipsis
Support broadcasting
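The ellipsis and broadcasting features listed above follow NumPy's einsum semantics, so a plain NumPy call is a convenient way to see what they mean. The sketch below uses `numpy.einsum` only; since the notes describe the high-level APIs as NumPy compliant, the same expression is expected to be accepted by `cuquantum.contract()`, though that call is not exercised here.

```python
import numpy as np

a = np.arange(24.0).reshape(2, 3, 4)  # a batch of two 3x4 matrices
b = np.arange(20.0).reshape(4, 5)     # a single 4x5 matrix

# The ellipsis stands for any leading (batch) modes. Operand b has none,
# so it is broadcast across the batch dimension of a.
out = np.einsum("...ij,...jk->...ik", a, b)
```

The result has the batch dimension of `a` and matches a batched matrix product.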
Add new APIs and functionalities:
For low-level APIs, please refer to the release notes of cuStateVec v1.0.0 and cuTensorNet v1.0.0.
The high-level APIs support an EMM-like memory plugin interface (see Memory management).
API changes:
For low-level APIs, please refer to the release notes of cuStateVec v1.0.0 and cuTensorNet v1.0.0.
No API breaking changes for the high-level APIs.
Compatibility notes:
cuQuantum Python requires cuQuantum v22.03
cuQuantum Python requires Python 3.7+
In the next release, Python 3.7 will be dropped to follow NEP-29.
cuQuantum Python requires NumPy v1.17+
In the next release, NumPy 1.17 & 1.18 will be dropped to follow NEP-29.
cuQuantum Python requires CuPy v9.5+
cuQuantum Python supports PyTorch v1.10+
Known issues:
If you install cuQuantum Python from PyPI (pip install cuquantum-python), make sure you also install typing_extensions (via pip or conda). This only affects the wheel installation and will be fixed in the next release. (NVIDIA/cuQuantum#3)
cuQuantum Python v0.1.0.1
Patch release:
Add a __version__ string
cuQuantum Python v0.1.0.0
Initial release (beta 2)
Compatibility notes:
cuQuantum Python requires cuQuantum v0.1.0
cuQuantum Python requires NumPy v1.17+
cuQuantum Python requires CuPy v9.5+
cuQuantum Python supports PyTorch v1.10+
Limitation notes:
In certain environments, if PyTorch is installed, import cuquantum could fail (with a segmentation fault). It is currently under investigation, and a temporary workaround is to import torch before importing cuquantum.
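The import-order workaround above can be captured in a small helper placed at the top of a script. This is only a sketch: `import_in_order` is a hypothetical function, and it simply skips packages that are not installed so the snippet also runs on machines without PyTorch or cuQuantum.

```python
import importlib


def import_in_order(*names):
    """Import modules in the given order, mapping missing ones to None."""
    loaded = {}
    for name in names:
        try:
            loaded[name] = importlib.import_module(name)
        except ImportError:
            loaded[name] = None
    return loaded


# torch first, then cuquantum, as the workaround suggests.
modules = import_in_order("torch", "cuquantum")
```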