cuquantum.cutensornet.contraction_autotune

cuquantum.cutensornet.contraction_autotune(intptr_t handle, intptr_t plan, raw_data_in, intptr_t raw_data_out, intptr_t work_desc, intptr_t pref, intptr_t stream)[source]

Auto-tunes the contraction plan to find the best cutensorContractionPlan_t for each pair-wise contraction.

Parameters
  • handle (intptr_t) – Opaque handle holding cuTensorNet’s library context.

  • plan (intptr_t) – The plan must already be created (see create_contraction_plan()); the individual contraction plans will be fine-tuned.

  • raw_data_in (object) –

    Array of N pointers (N being the number of input tensors specified in create_network_descriptor()); raw_data_in[i] points to the data associated with the i-th input tensor (in device memory). It can be:

    • an int as the pointer address to the array, or

    • a Python sequence of ints (as pointer addresses).

  • raw_data_out (intptr_t) – Points to the raw data of the output tensor (in device memory).

  • work_desc (intptr_t) – Opaque structure describing the workspace. The provided workspace must be valid (its size must be the same as or larger than both the minimum needed and the value provided at plan creation). See create_contraction_plan(), workspace_get_memory_size() & workspace_set_memory(). If a device memory handler is set, work_desc can be set to null, or the workspace pointer in work_desc can be set to null and the workspace size set either to 0 (in which case the "recommended" size is used; see CUTENSORNET_WORKSIZE_PREF_RECOMMENDED) or to a valid size. A workspace of the specified size will be drawn from the user's mempool and released back once done.

  • pref (intptr_t) – Configures the auto-tuning process and lets the user bound how much time is spent in this routine.

  • stream (intptr_t) – The CUDA stream on which the computation is performed.
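
Below is a minimal usage sketch of this routine from the low-level Python bindings. It assumes that handle, plan, and work_desc have already been created (see create_contraction_plan() and the workspace APIs), and that operands is a list of CuPy arrays holding the input tensors while result is a CuPy array allocated for the output; these names, and the choice of 5 auto-tuning iterations, are illustrative placeholders rather than prescribed values.

    import cupy as cp
    import numpy as np
    from cuquantum import cutensornet as cutn

    # Bound the time spent auto-tuning by limiting the number of iterations.
    pref = cutn.create_contraction_autotune_preference(handle)
    n_iter_dtype = cutn.contraction_autotune_preference_get_attribute_dtype(
        cutn.ContractionAutotunePreferenceAttribute.MAX_ITERATIONS)
    num_iters = np.asarray(5, dtype=n_iter_dtype)  # illustrative value
    cutn.contraction_autotune_preference_set_attribute(
        handle, pref,
        cutn.ContractionAutotunePreferenceAttribute.MAX_ITERATIONS,
        num_iters.ctypes.data, num_iters.dtype.itemsize)

    # Device pointers may be passed as a Python sequence of ints.
    raw_data_in = [op.data.ptr for op in operands]   # input tensors (device memory)
    raw_data_out = result.data.ptr                   # output tensor (device memory)

    # Fine-tune the individual pair-wise contraction plans on the current stream.
    stream = cp.cuda.get_current_stream()
    cutn.contraction_autotune(
        handle, plan, raw_data_in, raw_data_out,
        work_desc, pref, stream.ptr)

    cutn.destroy_contraction_autotune_preference(pref)

After auto-tuning, the same plan can be passed to the contraction routine; the preference object is no longer needed and can be destroyed as shown.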