cuquantum.Network¶
- class cuquantum.Network(subscripts, *operands, qualifiers=None, options=None, stream=None)[source]¶
Create a tensor network object specified as an Einstein summation expression.

The Einstein summation convention provides an elegant way of representing many tensor network operations. This object allows the user to invest the upfront effort of computing the best contraction path and autotuning the contraction once, and then amortize that cost over repeated contractions of the same network topology (different input tensors, or "operands", with the same Einstein summation expression). See also contract_path() and autotune().

For the Einstein summation expression, both the explicit and implicit forms are supported.
In the implicit form, the output mode labels are inferred from the summation expression and reordered lexicographically. An example is the expression 'ij,jh', for which the output mode labels are 'hi'. (This corresponds to a matrix multiplication followed by a transpose.)

In the explicit form, the output mode labels can be directly stated following the identifier '->' in the summation expression. An example is the expression 'ij,jh->ih' (which corresponds to a matrix multiplication).

To specify an Einstein summation expression, both the subscript format (as shown above) and the interleaved format are supported.
The interleaved format is an alternative way of specifying the operands and their mode labels as Network(op0, modes0, op1, modes1, ..., [modes_out]), where opN is the N-th operand and modesN is a sequence of hashable and comparable objects (strings, integers, etc.) representing the N-th operand's mode labels. An illustrative sketch is shown below.

Ellipsis broadcasting is supported.
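The following minimal sketch illustrates the interleaved format for the matrix-multiplication example above; the operands a and b are placeholders created here purely for illustration:

>>> from cuquantum import Network
>>> import numpy as np
>>> a = np.random.rand(3, 4)
>>> b = np.random.rand(4, 5)
>>> # Interleaved form of 'ij,jh->ih': mode labels follow each operand, output labels last
>>> tn = Network(a, ('i', 'j'), b, ('j', 'h'), ('i', 'h'))
>>> tn.free()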
Additional information on the various operations performed on the network can be obtained by passing in a logging.Logger object to NetworkOptions or by setting the appropriate options in the root logger object, which is used by default:

>>> import logging
>>> logging.basicConfig(level=logging.INFO, format='%(asctime)s %(levelname)-8s %(message)s', datefmt='%m-%d %H:%M:%S')
- Parameters
  - subscripts – The mode labels (subscripts) defining the Einstein summation expression as a comma-separated sequence of characters. Unicode characters are allowed in the expression, thereby expanding the size of the tensor network that can be specified using the Einstein summation convention.
  - operands – A sequence of tensors (ndarray-like objects). The currently supported types are numpy.ndarray, cupy.ndarray, and torch.Tensor.
  - qualifiers – Specify the tensor qualifiers as a numpy.ndarray of tensor_qualifiers_dtype objects of length equal to the number of operands.
  - options – Specify options for the tensor network as a NetworkOptions object. Alternatively, a dict containing the parameters for the NetworkOptions constructor can also be provided. If not specified, the value will be set to the default-constructed NetworkOptions object.
  - stream – Provide the CUDA stream to use for network construction, which is needed for stream-ordered operations such as allocating memory. Acceptable inputs include cudaStream_t (as Python int), cupy.cuda.Stream, and torch.cuda.Stream. If a stream is not provided, the current stream will be used.
Examples
>>> from cuquantum import Network
>>> import numpy as np
Define the parameters of the tensor network:
>>> expr = 'ehl,gj,edhg,bif,d,c,k,iklj,cf,a->ba'
>>> shapes = [(8, 2, 5), (5, 7), (8, 8, 2, 5), (8, 6, 3), (8,), (6,), (5,), (6, 5, 5, 7), (6, 3), (3,)]
Create the input tensors using NumPy:
>>> operands = [np.random.rand(*shape) for shape in shapes]
Create a Network object:

>>> tn = Network(expr, *operands)
Find the best contraction order:
>>> path, info = tn.contract_path({'samples': 500})
Autotune the network:
>>> tn.autotune(iterations=5)
Perform the contraction. The result is of the same type and on the same device as the operands:
>>> r1 = tn.contract()
Reset operands to new values:
>>> operands = [i*operand for i, operand in enumerate(operands, start=1)]
>>> tn.reset_operands(*operands)
Get the result of the new contraction:
>>> r2 = tn.contract()
>>> from math import factorial
>>> np.allclose(r2, factorial(len(operands))*r1)
True
Finally, free network resources. If this call isn't made, it may hinder further operations (especially if the network is large) since the memory will be released only when the object goes out of scope. (To avoid having to make this call explicitly, it is recommended to use the Network object as a context manager.)

>>> tn.free()
If the operands are on the GPU, they can also be updated using in-place operations. In this case, the call to reset_operands() can be skipped – subsequent contract() calls will use the same operands (with updated contents). The following example illustrates this using CuPy operands and also demonstrates the usage of a Network context (so as to skip calling free()):

>>> import cupy as cp
>>> expr = 'ehl,gj,edhg,bif,d,c,k,iklj,cf,a->ba'
>>> shapes = [(8, 2, 5), (5, 7), (8, 8, 2, 5), (8, 6, 3), (8,), (6,), (5,), (6, 5, 5, 7), (6, 3), (3,)]
>>> operands = [cp.random.rand(*shape) for shape in shapes]
>>>
>>> with Network(expr, *operands) as tn:
...     path, info = tn.contract_path({'samples': 500})
...     tn.autotune(iterations=5)
...
...     # Perform the contraction
...     r1 = tn.contract()
...
...     # Update the operands in place
...     for i, operand in enumerate(operands, start=1):
...         operand *= i
...
...     # Perform the contraction with the updated operand values
...     r2 = tn.contract()
...
...     # The resources used by the network are automatically released when the context ends.
>>>
>>> from math import factorial
>>> cp.allclose(r2, factorial(len(operands))*r1)
array(True)
PyTorch CPU and GPU tensors can be passed as input operands in the same fashion.
To compute the gradients of the network w.r.t. the input operands (NumPy/CuPy/PyTorch), the gradients() method can be used. To enable the gradient computation, one should:

- create the network with the qualifiers argument,
- call the contract() method prior to the gradients() method, and
- seed the gradients() method with the output gradient (see its docs for the requirements).
Below is a minimal example:
>>> from cuquantum import cutensornet as cutn
>>> expr = "ijk,jkl,klm,lmn"
>>> shapes = ((3, 4, 5), (4, 5, 3), (5, 3, 2), (3, 2, 6))
>>> operands = [cp.random.rand(*shape) for shape in shapes]
>>> qualifiers = np.zeros(len(shapes), dtype=cutn.tensor_qualifiers_dtype)
>>> qualifiers[:]["requires_gradient"] = 1  # request gradients for all input tensors
>>>
>>> with Network(expr, *operands, qualifiers=qualifiers) as tn:
...     path, info = tn.contract_path()
...
...     # Perform the contraction
...     r = tn.contract()
...
...     # Perform the backprop
...     input_grads = tn.gradients(cp.ones_like(r))
...
>>>
For PyTorch CPU/GPU tensors with the requires_grad attribute set, one does not need to pass the qualifiers argument; a minimal sketch is shown below. Note that this Network class and its methods are not PyTorch operators and do not add any node to PyTorch's autograd graph. For a native, differentiable PyTorch operator, use the cuquantum.contract() function.
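The following sketch (reusing the expression and shapes from the example above, and assuming PyTorch is available) illustrates this: tensors created with requires_grad=True do not require the qualifiers argument.

>>> import torch
>>> operands = [torch.rand(*shape, requires_grad=True) for shape in shapes]
>>>
>>> with Network(expr, *operands) as tn:
...     path, info = tn.contract_path()
...     r = tn.contract()
...     # Gradients are computed for all operands created with requires_grad=True
...     input_grads = tn.gradients(torch.ones_like(r))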
See contract() for more examples on specifying the Einstein summation expression, as well as specifying options for the tensor network and the optimizer.

Methods
- autotune(*, iterations=3, stream=None, release_workspace=False)[source]¶
Autotune the network to reduce the contraction cost.
This is an optional step that is recommended if the Network object is used to perform multiple contractions.

- Parameters
  - iterations – The number of iterations for autotuning. See CUTENSORNET_CONTRACTION_AUTOTUNE_MAX_ITERATIONS.
  - stream – Provide the CUDA stream to use for the autotuning operation. Acceptable inputs include cudaStream_t (as Python int), cupy.cuda.Stream, and torch.cuda.Stream. If a stream is not provided, the current stream will be used.
  - release_workspace – A value of True specifies that the Network object should release workspace memory back to the package memory pool on function return, while a value of False specifies that the Network object should retain the memory. This option may be set to True if the application performs other operations that consume a lot of memory between successive calls to the (same or different) execution API such as autotune(), contract(), or gradients(), but it incurs a small overhead due to obtaining and releasing workspace memory from and to the package memory pool on every call. The default is False.
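As an illustrative sketch (assuming a constructed network tn as in the examples above), release_workspace can be set when other memory-hungry work is interleaved between execution calls:

>>> tn.autotune(iterations=5, release_workspace=True)
>>> # ... other memory-intensive work on the same device ...
>>> r = tn.contract(release_workspace=True)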
- contract(*, slices=None, stream=None, release_workspace=False)[source]¶
Contract the network and return the result.
- Parameters
  - slices – Specify the slices to be contracted as a Python range for contiguous slice IDs or as a Python sequence object for arbitrary slice IDs. If not specified, all slices will be contracted.
  - stream – Provide the CUDA stream to use for the contraction operation. Acceptable inputs include cudaStream_t (as Python int), cupy.cuda.Stream, and torch.cuda.Stream. If a stream is not provided, the current stream will be used.
  - release_workspace – A value of True specifies that the Network object should release workspace memory back to the package memory pool on function return, while a value of False specifies that the Network object should retain the memory. This option may be set to True if the application performs other operations that consume a lot of memory between successive calls to the (same or different) execution API such as autotune(), contract(), or gradients(), but it incurs a small overhead due to obtaining and releasing workspace memory from and to the package memory pool on every call. The default is False.
- Returns
The result is of the same type and on the same device as the operands.
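A minimal sketch of slice-wise contraction, assuming opt_info returned by contract_path() reports the total slice count (the num_slices attribute name is an assumption; see OptimizerInfo):

>>> num_slices = opt_info.num_slices  # assumed attribute; see OptimizerInfo
>>> partial = tn.contract(slices=range(num_slices // 2))                # first half of the slices
>>> rest = tn.contract(slices=range(num_slices // 2, num_slices))      # remaining slices
>>> full = partial + rest   # slice contributions sum to the full contraction result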
- contract_path(optimize=None)[source]¶
Compute the best contraction path together with any slicing that is needed to ensure that the contraction can be performed within the specified memory limit.
- Parameters
  - optimize – This parameter specifies options for path optimization as an OptimizerOptions object. Alternatively, a dictionary containing the parameters for the OptimizerOptions constructor can also be provided. If not specified, the value will be set to the default-constructed OptimizerOptions object.
- Returns
  A 2-tuple (path, opt_info):
  - path: A sequence of pairs of operand ordinals representing the best contraction order in the numpy.einsum_path() format.
  - opt_info: An object of type OptimizerInfo containing information about the best contraction order.
Notes
If the path is provided, the user has to set the sliced modes too if slicing is desired.
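A minimal sketch of the two equivalent ways of passing optimizer options (the sample count here is arbitrary):

>>> from cuquantum import OptimizerOptions
>>> # Pass a dict of OptimizerOptions parameters ...
>>> path, opt_info = tn.contract_path({'samples': 500})
>>> # ... or an OptimizerOptions object
>>> path, opt_info = tn.contract_path(optimize=OptimizerOptions(samples=500))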
- free()[source]¶
Free network resources.
It is recommended that the Network object be used within a context, but if that is not possible then this method must be called explicitly to ensure that the network resources are properly cleaned up.
- gradients(output_gradient, *, stream=None, release_workspace=False)[source]¶
Compute the gradients of the network (w.r.t. the input operands whose gradients are required).
Before calling this method, a full contraction must have been performed (by calling contract()), otherwise an error is raised.

- Parameters
  - output_gradient – A tensor of the same package (NumPy/CuPy/PyTorch), shape, dtype, strides, and location (CPU/GPU) as the contraction output (as returned by contract()), which in turn shares the same properties with the input operands. In a chain-rule setting, output_gradient is the gradient w.r.t. the output tensor.
  - stream – Provide the CUDA stream to use for the gradient computation. Acceptable inputs include cudaStream_t (as Python int), cupy.cuda.Stream, and torch.cuda.Stream. If a stream is not provided, the current stream will be used.
  - release_workspace – A value of True specifies that the Network object should release workspace memory back to the package memory pool on function return, while a value of False specifies that the Network object should retain the memory. This option may be set to True if the application performs other operations that consume a lot of memory between successive calls to the (same or different) execution API such as autotune(), contract(), or gradients(), but it incurs a small overhead due to obtaining and releasing workspace memory from and to the package memory pool on every call. The default is False.
- Returns
  A sequence of gradient tensors. The result is of the same length and type, and on the same device, as the input operands. For the gradient components that are not requested, None is returned.
Note
For PyTorch operands, calling this method is not tracked by the autograd graph.
Warning
This API is experimental and subject to future changes.
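A minimal chain-rule sketch, assuming the network tn and operands from the gradient example above (where gradients are requested for all operands); the upstream gradient here is a placeholder for the actual gradient of a downstream loss w.r.t. the output:

>>> r = tn.contract()
>>> upstream = cp.ones_like(r)   # placeholder for dL/dr from a downstream computation
>>> input_grads = tn.gradients(upstream)
>>> len(input_grads) == len(operands)
True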
- reset_operands(*operands, stream=None)[source]¶
Reset the operands held by this Network instance.

This method has two use cases: (1) it can be used to provide new operands for execution when the original operands are on the CPU, or (2) it can be used to release the internal reference to the previous operands and make their memory available for other use by passing None for the operands argument. In the latter case, this method must be called again to provide the desired operands before another call to execution APIs like autotune(), contract(), or gradients(). A sketch of both use cases follows the parameter list below.
This method is not needed when the operands reside on the GPU and in-place operations are used to update the operand values.
This method will perform various checks on the new operands to make sure:

- the shapes, strides, and datatypes match those of the old ones,
- the packages that the operands belong to match those of the old ones, and
- if the input tensors are on the GPU, the library package and device match those of the old ones.
- Parameters
  - operands – See Network's documentation.
  - stream – Provide the CUDA stream to use for resetting operands (this is used to copy the operands to the GPU if they are provided on the CPU). Acceptable inputs include cudaStream_t (as Python int), cupy.cuda.Stream, and torch.cuda.Stream. If a stream is not provided, the current stream will be used.
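The following minimal sketch illustrates both use cases, assuming a Network tn constructed from CPU (NumPy) operands as in the first example above (before free() is called); passing None as shown is taken to release the operand references:

>>> # Use case (2): release the internal reference so the operand memory can be reused elsewhere
>>> tn.reset_operands(None)
>>> # ... other work that needs the memory ...
>>> # Use case (1): provide new operands with matching shapes, strides, and dtypes
>>> new_operands = [2.0 * operand for operand in operands]
>>> tn.reset_operands(*new_operands)
>>> r = tn.contract()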