cuquantum.Network¶
- class cuquantum.Network(subscripts, *operands, qualifiers=None, options=None, stream=None)[source]¶
Create a tensor network object specified as an Einstein summation expression.

The Einstein summation convention provides an elegant way of representing many tensor network operations. This object allows the user to invest the upfront effort of computing the best contraction path and autotuning the contraction once, and then amortize that cost over repeated contractions of the same network topology (different input tensors, or "operands", with the same Einstein summation expression). See also contract_path() and autotune().

For the Einstein summation expression, both the explicit and implicit forms are supported.
In the implicit form, the output mode labels are inferred from the summation expression and reordered lexicographically. An example is the expression 'ij,jh', for which the output mode labels are 'hi'. (This corresponds to a matrix multiplication followed by a transpose.)

In the explicit form, the output mode labels can be directly stated following the identifier '->' in the summation expression. An example is the expression 'ij,jh->ih' (which corresponds to a matrix multiplication).

To specify an Einstein summation expression, both the subscript format (as shown above) and the interleaved format are supported.
The interleaved format is an alternative way of specifying the operands and their mode labels as Network(op0, modes0, op1, modes1, ..., [modes_out]), where opN is the N-th operand and modesN is a sequence of hashable and comparable objects (strings, integers, etc.) representing the N-th operand's mode labels. An illustrative sketch is shown below.

Ellipsis broadcasting is supported.
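The following minimal sketch illustrates the interleaved format for the matrix-multiplication example above; the operands a and b are placeholders created here purely for illustration:

>>> from cuquantum import Network
>>> import numpy as np
>>> a = np.random.rand(3, 4)
>>> b = np.random.rand(4, 5)
>>> # Interleaved form of 'ij,jh->ih': mode labels follow each operand, output labels last
>>> tn = Network(a, ('i', 'j'), b, ('j', 'h'), ('i', 'h'))
>>> tn.free()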
Additional information on the various operations performed on the network can be obtained by passing in a logging.Logger object to NetworkOptions or by setting the appropriate options in the root logger object, which is used by default:

>>> import logging
>>> logging.basicConfig(level=logging.INFO, format='%(asctime)s %(levelname)-8s %(message)s', datefmt='%m-%d %H:%M:%S')
- Parameters
  - subscripts – The mode labels (subscripts) defining the Einstein summation expression as a comma-separated sequence of characters. Unicode characters are allowed in the expression, thereby expanding the size of the tensor network that can be specified using the Einstein summation convention.
  - operands – A sequence of tensors (ndarray-like objects). The currently supported types are numpy.ndarray, cupy.ndarray, and torch.Tensor.
  - qualifiers – Specify the tensor qualifiers as a numpy.ndarray of tensor_qualifiers_dtype objects of length equal to the number of operands.
  - options – Specify options for the tensor network as a NetworkOptions object. Alternatively, a dict containing the parameters for the NetworkOptions constructor can also be provided. If not specified, the value will be set to the default-constructed NetworkOptions object.
  - stream – Provide the CUDA stream to use for network construction, which is needed for stream-ordered operations such as allocating memory. Acceptable inputs include cudaStream_t (as Python int), cupy.cuda.Stream, and torch.cuda.Stream. If a stream is not provided, the current stream will be used.
Examples
>>> from cuquantum import Network
>>> import numpy as np
Define the parameters of the tensor network:
>>> expr = 'ehl,gj,edhg,bif,d,c,k,iklj,cf,a->ba'
>>> shapes = [(8, 2, 5), (5, 7), (8, 8, 2, 5), (8, 6, 3), (8,), (6,), (5,), (6, 5, 5, 7), (6, 3), (3,)]
Create the input tensors using NumPy:
>>> operands = [np.random.rand(*shape) for shape in shapes]
Create a Network object:

>>> tn = Network(expr, *operands)
Find the best contraction order:
>>> path, info = tn.contract_path({'samples': 500})
Autotune the network:
>>> tn.autotune(iterations=5)
Perform the contraction. The result is of the same type and on the same device as the operands:
>>> r1 = tn.contract()
Reset operands to new values:
>>> operands = [i*operand for i, operand in enumerate(operands, start=1)]
>>> tn.reset_operands(*operands)
Get the result of the new contraction:
>>> r2 = tn.contract()
>>> from math import factorial
>>> np.allclose(r2, factorial(len(operands))*r1)
True
Finally, free network resources. If this call isn't made, it may hinder further operations (especially if the network is large) since the memory will be released only when the object goes out of scope. (To avoid having to make this call explicitly, it is recommended to use the Network object as a context manager.)

>>> tn.free()
If the operands are on the GPU, they can also be updated using in-place operations. In this case, the call to reset_operands() can be skipped – subsequent contract() calls will use the same operands (with updated contents). The following example illustrates this using CuPy operands and also demonstrates the usage of a Network context (so as to skip calling free()):

>>> import cupy as cp
>>> expr = 'ehl,gj,edhg,bif,d,c,k,iklj,cf,a->ba'
>>> shapes = [(8, 2, 5), (5, 7), (8, 8, 2, 5), (8, 6, 3), (8,), (6,), (5,), (6, 5, 5, 7), (6, 3), (3,)]
>>> operands = [cp.random.rand(*shape) for shape in shapes]
>>>
>>> with Network(expr, *operands) as tn:
...     path, info = tn.contract_path({'samples': 500})
...     tn.autotune(iterations=5)
...
...     # Perform the contraction
...     r1 = tn.contract()
...
...     # Update the operands in place
...     for i, operand in enumerate(operands, start=1):
...         operand *= i
...
...     # Perform the contraction with the updated operand values
...     r2 = tn.contract()
...
...     # The resources used by the network are automatically released when the context ends.
>>>
>>> from math import factorial
>>> cp.allclose(r2, factorial(len(operands))*r1)
array(True)
PyTorch CPU and GPU tensors can be passed as input operands in the same fashion.
To compute the gradients of the network w.r.t. the input operands (NumPy/CuPy/PyTorch), the gradients() method can be used. To enable the gradient computation, one should:

- create the network with the qualifiers argument,
- call the contract() method prior to the gradients() method, and
- seed the gradients() method with the output gradient (see its docs for the requirements).
Below is a minimal example:
>>> from cuquantum import cutensornet as cutn
>>> expr = "ijk,jkl,klm,lmn"
>>> shapes = ((3, 4, 5), (4, 5, 3), (5, 3, 2), (3, 2, 6))
>>> operands = [cp.random.rand(*shape) for shape in shapes]
>>> qualifiers = np.zeros(len(shapes), dtype=cutn.tensor_qualifiers_dtype)
>>> qualifiers[:]["requires_gradient"] = 1  # request gradients for all input tensors
>>>
>>> with Network(expr, *operands, qualifiers=qualifiers) as tn:
...     path, info = tn.contract_path()
...
...     # Perform the contraction
...     r = tn.contract()
...
...     # Perform the backprop
...     input_grads = tn.gradients(cp.ones_like(r))
...
>>>
For PyTorch CPU/GPU tensors with the requires_grad attribute set, one does not need to pass the qualifiers argument; a minimal sketch is shown below. Note that this Network class and its methods are not PyTorch operators and do not add any node to PyTorch's autograd graph. For a native, differentiable PyTorch operator, use the cuquantum.contract() function.
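The following sketch (reusing the expression and shapes from the example above, and assuming PyTorch is available) illustrates this: tensors created with requires_grad=True do not require the qualifiers argument.

>>> import torch
>>> operands = [torch.rand(*shape, requires_grad=True) for shape in shapes]
>>>
>>> with Network(expr, *operands) as tn:
...     path, info = tn.contract_path()
...     r = tn.contract()
...     # Gradients are computed for all operands created with requires_grad=True
...     input_grads = tn.gradients(torch.ones_like(r))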
See contract() for more examples on specifying the Einstein summation expression, as well as specifying options for the tensor network and the optimizer.

Methods
- autotune(*, iterations=3, stream=None, release_workspace=False)[source]¶
Autotune the network to reduce the contraction cost.
This is an optional step that is recommended if the Network object is used to perform multiple contractions.

- Parameters
  - iterations – The number of iterations for autotuning. See CUTENSORNET_CONTRACTION_AUTOTUNE_MAX_ITERATIONS.
  - stream – Provide the CUDA stream to use for the autotuning operation. Acceptable inputs include cudaStream_t (as Python int), cupy.cuda.Stream, and torch.cuda.Stream. If a stream is not provided, the current stream will be used.
  - release_workspace – A value of True specifies that the Network object should release workspace memory back to the package memory pool on function return, while a value of False specifies that the Network object should retain the memory. This option may be set to True if the application performs other operations that consume a lot of memory between successive calls to the (same or different) execution API such as autotune(), contract(), or gradients(), but it incurs a small overhead due to obtaining and releasing workspace memory from and to the package memory pool on every call. The default is False.
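As an illustrative sketch (assuming a constructed network tn as in the examples above), release_workspace can be set when other memory-hungry work is interleaved between execution calls:

>>> tn.autotune(iterations=5, release_workspace=True)
>>> # ... other memory-intensive work on the same device ...
>>> r = tn.contract(release_workspace=True)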
- contract(*, slices=None, stream=None, release_workspace=False)[source]¶
Contract the network and return the result.
- Parameters
  - slices – Specify the slices to be contracted as a Python range for contiguous slice IDs or as a Python sequence object for arbitrary slice IDs. If not specified, all slices will be contracted.
  - stream – Provide the CUDA stream to use for the contraction operation. Acceptable inputs include cudaStream_t (as Python int), cupy.cuda.Stream, and torch.cuda.Stream. If a stream is not provided, the current stream will be used.
  - release_workspace – A value of True specifies that the Network object should release workspace memory back to the package memory pool on function return, while a value of False specifies that the Network object should retain the memory. This option may be set to True if the application performs other operations that consume a lot of memory between successive calls to the (same or different) execution API such as autotune(), contract(), or gradients(), but it incurs a small overhead due to obtaining and releasing workspace memory from and to the package memory pool on every call. The default is False.
- Returns
The result is of the same type and on the same device as the operands.
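A minimal sketch of slice-wise contraction, assuming opt_info returned by contract_path() reports the total slice count (the num_slices attribute name is an assumption; see OptimizerInfo):

>>> num_slices = opt_info.num_slices  # assumed attribute; see OptimizerInfo
>>> partial = tn.contract(slices=range(num_slices // 2))                # first half of the slices
>>> rest = tn.contract(slices=range(num_slices // 2, num_slices))      # remaining slices
>>> full = partial + rest   # slice contributions sum to the full contraction result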
- contract_path(optimize=None)[source]¶
Compute the best contraction path together with any slicing that is needed to ensure that the contraction can be performed within the specified memory limit.
- Parameters
  - optimize – This parameter specifies options for path optimization as an OptimizerOptions object. Alternatively, a dictionary containing the parameters for the OptimizerOptions constructor can also be provided. If not specified, the value will be set to the default-constructed OptimizerOptions object.
- Returns
  A 2-tuple (path, opt_info):
  - path: A sequence of pairs of operand ordinals representing the best contraction order in the numpy.einsum_path() format.
  - opt_info: An object of type OptimizerInfo containing information about the best contraction order.
Notes
If the path is provided, the user has to set the sliced modes too if slicing is desired.
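A minimal sketch of the two equivalent ways of passing optimizer options (the sample count here is arbitrary):

>>> from cuquantum import OptimizerOptions
>>> # Pass a dict of OptimizerOptions parameters ...
>>> path, opt_info = tn.contract_path({'samples': 500})
>>> # ... or an OptimizerOptions object
>>> path, opt_info = tn.contract_path(optimize=OptimizerOptions(samples=500))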
- free()[source]¶
Free network resources.
It is recommended that the Network object be used within a context, but if that is not possible then this method must be called explicitly to ensure that the network resources are properly cleaned up.
- gradients(output_gradient, *, stream=None, release_workspace=False)[source]¶
Compute the gradients of the network (w.r.t. the input operands whose gradients are required).
Before calling this method, a full contraction must have been performed (by calling contract()), otherwise an error is raised.

- Parameters
  - output_gradient – A tensor of the same package (NumPy/CuPy/PyTorch), shape, dtype, strides, and location (CPU/GPU) as the contraction output (as returned by contract()), which in turn shares the same properties with the input operands. In a chain-rule setting, output_gradient is the gradient w.r.t. the output tensor.
  - stream – Provide the CUDA stream to use for the gradient computation. Acceptable inputs include cudaStream_t (as Python int), cupy.cuda.Stream, and torch.cuda.Stream. If a stream is not provided, the current stream will be used.
  - release_workspace – A value of True specifies that the Network object should release workspace memory back to the package memory pool on function return, while a value of False specifies that the Network object should retain the memory. This option may be set to True if the application performs other operations that consume a lot of memory between successive calls to the (same or different) execution API such as autotune(), contract(), or gradients(), but it incurs a small overhead due to obtaining and releasing workspace memory from and to the package memory pool on every call. The default is False.
- Returns
  A sequence of gradient tensors. The result is of the same length and type, and on the same device, as the input operands. For the gradient components that are not requested, None is returned.
Note
For PyTorch operands, calling this method is not tracked by the autograd graph.
Warning
This API is experimental and subject to future changes.
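A minimal chain-rule sketch, assuming the network tn and operands from the gradient example above (where gradients are requested for all operands); the upstream gradient here is a placeholder for the actual gradient of a downstream loss w.r.t. the output:

>>> r = tn.contract()
>>> upstream = cp.ones_like(r)   # placeholder for dL/dr from a downstream computation
>>> input_grads = tn.gradients(upstream)
>>> len(input_grads) == len(operands)
True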
- reset_operands(*operands, stream=None)[source]¶
Reset the operands held by this Network instance.

This method has two use cases: (1) it can be used to provide new operands for execution when the original operands are on the CPU, or (2) it can be used to release the internal reference to the previous operands and make their memory available for other use by passing None for the operands argument. In the latter case, this method must be called again to provide the desired operands before another call to execution APIs like autotune(), contract(), or gradients(). A sketch of both use cases follows the parameter list below.
This method is not needed when the operands reside on the GPU and in-place operations are used to update the operand values.
This method will perform various checks on the new operands to make sure:

- the shapes, strides, and datatypes match those of the old ones,
- the packages that the operands belong to match those of the old ones, and
- if the input tensors are on the GPU, the library package and device match those of the old ones.
- Parameters
  - operands – See Network's documentation.
  - stream – Provide the CUDA stream to use for resetting operands (this is used to copy the operands to the GPU if they are provided on the CPU). Acceptable inputs include cudaStream_t (as Python int), cupy.cuda.Stream, and torch.cuda.Stream. If a stream is not provided, the current stream will be used.
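The following minimal sketch illustrates both use cases, assuming a Network tn constructed from CPU (NumPy) operands as in the first example above (before free() is called); passing None as shown is taken to release the operand references:

>>> # Use case (2): release the internal reference so the operand memory can be reused elsewhere
>>> tn.reset_operands(None)
>>> # ... other work that needs the memory ...
>>> # Use case (1): provide new operands with matching shapes, strides, and dtypes
>>> new_operands = [2.0 * operand for operand in operands]
>>> tn.reset_operands(*new_operands)
>>> r = tn.contract()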