cuquantum.cutensornet.compute_gradients_backward

cuquantum.cutensornet.compute_gradients_backward(intptr_t handle, intptr_t plan, raw_data_in, intptr_t output_gradient, gradients, int32_t accumulate_output, intptr_t work_desc, intptr_t stream)

Computes the gradients of the network with respect to the input tensors whose gradients are required. The network must already have been contracted, with its intermediate tensors loaded in the CACHE workspace of work_desc. Operates only on networks with a single slice and no singleton modes.
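To make "gradients of the network with respect to the input tensors" concrete, the following is a minimal pure-Python sketch for the smallest possible network, a real-valued matrix product C = A @ B. It does not call cuTensorNet at all; the helper names (matmul, transpose, gradients_backward) are hypothetical and exist only for this illustration. Given the gradient of some scalar with respect to the output C, the gradient with respect to each input is itself a contraction of that output gradient with the other input.

```python
def matmul(a, b):
    """Multiply two real matrices stored as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]


def transpose(m):
    """Return the transpose of a matrix stored as a list of rows."""
    return [list(row) for row in zip(*m)]


def gradients_backward(a, b, grad_out):
    """Illustrative backward pass for C = A @ B (real-valued case only).

    grad_out plays the role of output_gradient: it has the shape of C.
    The returned gradients have the shapes of A and B respectively.
    """
    grad_a = matmul(grad_out, transpose(b))  # dL/dA = dL/dC @ B^T
    grad_b = matmul(transpose(a), grad_out)  # dL/dB = A^T @ dL/dC
    return grad_a, grad_b


a = [[1.0, 2.0], [3.0, 4.0]]
b = [[5.0, 6.0], [7.0, 8.0]]
grad_out = [[1.0, 0.0], [0.0, 1.0]]  # identity as the upstream gradient

grad_a, grad_b = gradients_backward(a, b, grad_out)
```

With the identity as the upstream gradient, grad_a reduces to the transpose of B and grad_b to the transpose of A, which is an easy way to sanity-check the formulas by hand.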

Parameters
  • handle (intptr_t) – Opaque handle holding cuTensorNet’s library context.

  • plan (intptr_t) – Encodes the execution of a tensor network contraction (see create_contraction_plan() and contraction_autotune()). Some internal meta-data may be updated upon contraction.

  • raw_data_in (object) –

    Array of N pointers (N being the number of input tensors specified in create_network_descriptor()): raw_data_in[i] points to the data associated with the i-th input tensor (in device memory). It can be:

    • an int as the pointer address to the array, or

    • a Python sequence of ints (as pointer addresses).

  • output_gradient (intptr_t) – Gradient of the output tensor (in device memory). Must have the same memory layout (strides) as the output tensor of the tensor network.

  • gradients (object) –

    Array of N pointers: gradients[i] points to the gradient data associated with the i-th input tensor (in device memory). Setting gradients[i] to null skips computing the gradient of the i-th input tensor. Each generated gradient has the same memory layout (strides) as its corresponding input tensor. It can be:

    • an int as the pointer address to the array, or

    • a Python sequence of ints (as pointer addresses).

  • accumulate_output (int32_t) – If 0, writes the gradient results into gradients; otherwise, accumulates the results into gradients.

  • work_desc (intptr_t) – Opaque structure describing the workspace. The provided CUTENSORNET_WORKSPACE_SCRATCH workspace must be valid (its size must be at least the minimum required). See workspace_compute_contraction_sizes(), workspace_get_memory_size(), and workspace_set_memory(). The provided CUTENSORNET_WORKSPACE_CACHE workspace must also be valid and must contain the cached intermediate tensors from the corresponding contract_slices() call. If a device memory handler is set, and either work_desc is null or the memory pointer of either workspace kind in work_desc is null for both the contract_slices() and compute_gradients_backward() calls, memory is drawn from the memory pool. See contract_slices() for details.

  • stream (intptr_t) – The CUDA stream on which the computation is performed.
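The parameters above imply a call order: a forward contract_slices() call fills the CACHE workspace, and compute_gradients_backward() then reuses the same plan, inputs, and work_desc. The sketch below shows that order under several assumptions. The helper contract_then_backward and its arguments are hypothetical, the module is passed in as cutn rather than imported, the contract_slices() argument order (handle, plan, raw_data_in, raw_data_out, accumulate_output, work_desc, slice_group, stream) is assumed from the cuQuantum Python reference and should be checked against the installed version, and passing 0 is assumed to be how a null pointer or a null slice group is expressed from Python.

```python
def contract_then_backward(cutn, handle, plan, inputs, output, output_gradient,
                           gradients, work_desc, stream, accumulate=False):
    """Hypothetical helper: forward contraction, then the backward pass.

    cutn      : object exposing contract_slices() and
                compute_gradients_backward(), e.g. cuquantum.cutensornet
    inputs    : device pointer addresses (ints) of the N input tensors
    gradients : N entries, each a device pointer address or None;
                None is mapped to 0 (null) so that gradient is skipped
    """
    # "A Python sequence of ints (as pointer addresses)".
    raw_data_in = [int(p) for p in inputs]
    grad_ptrs = [0 if g is None else int(g) for g in gradients]
    if len(grad_ptrs) != len(raw_data_in):
        raise ValueError("gradients must have one entry per input tensor")

    # Forward pass. Assumption: accumulate_output=0 overwrites the output and
    # slice_group=0 (null) contracts all slices. This call is what populates
    # the CACHE workspace held by work_desc.
    cutn.contract_slices(handle, plan, raw_data_in, int(output),
                         0, work_desc, 0, stream)

    # Backward pass on the same plan, inputs, work_desc, and stream.
    cutn.compute_gradients_backward(handle, plan, raw_data_in,
                                    int(output_gradient), grad_ptrs,
                                    1 if accumulate else 0,
                                    work_desc, stream)
    return grad_ptrs
```

Taking the module as an argument keeps the sketch importable on machines without a GPU and makes the call order easy to check with a stand-in object. In real use the handle, plan, workspace descriptor, and device buffers would have to be created beforehand with the functions referenced in the parameter descriptions.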