cuTENSOR Data Types

`cutensorComputeType_t`

enum cutensorComputeType_t

Brief: Encodes cuTENSOR’s compute type (see “User Guide - Accuracy Guarantees” for details).

Values:

enumerator CUTENSOR_COMPUTE_16F = (1U << 0U): floating-point: 5-bit exponent and 10-bit mantissa (aka half)

enumerator CUTENSOR_COMPUTE_16BF = (1U << 10U): floating-point: 8-bit exponent and 7-bit mantissa (aka bfloat)

enumerator CUTENSOR_COMPUTE_TF32 = (1U << 12U): floating-point: 8-bit exponent and 10-bit mantissa (aka tensor-float-32)

enumerator CUTENSOR_COMPUTE_32F = (1U << 2U): floating-point: 8-bit exponent and 23-bit mantissa (aka float)

enumerator CUTENSOR_COMPUTE_64F = (1U << 4U): floating-point: 11-bit exponent and 52-bit mantissa (aka double)

enumerator CUTENSOR_COMPUTE_8U = (1U << 6U): 8-bit unsigned integer

enumerator CUTENSOR_COMPUTE_8I = (1U << 8U): 8-bit signed integer

enumerator CUTENSOR_COMPUTE_32U = (1U << 7U): 32-bit unsigned integer

enumerator CUTENSOR_COMPUTE_32I = (1U << 9U): 32-bit signed integer

enumerator CUTENSOR_R_MIN_16F = (1U << 0U): DEPRECATED (real as a half), please use CUTENSOR_COMPUTE_16F instead.

enumerator CUTENSOR_C_MIN_16F = (1U << 1U): DEPRECATED (complex as a half), please use CUTENSOR_COMPUTE_16F instead.

enumerator CUTENSOR_R_MIN_32F = (1U << 2U): DEPRECATED (real as a float), please use CUTENSOR_COMPUTE_32F instead.

enumerator CUTENSOR_C_MIN_32F = (1U << 3U): DEPRECATED (complex as a float), please use CUTENSOR_COMPUTE_32F instead.

enumerator CUTENSOR_R_MIN_64F = (1U << 4U): DEPRECATED (real as a double), please use CUTENSOR_COMPUTE_64F instead.

enumerator CUTENSOR_C_MIN_64F = (1U << 5U): DEPRECATED (complex as a double), please use CUTENSOR_COMPUTE_64F instead.

enumerator CUTENSOR_R_MIN_8U = (1U << 6U): DEPRECATED (real as a uint8), please use CUTENSOR_COMPUTE_8U instead.

enumerator CUTENSOR_R_MIN_32U = (1U << 7U): DEPRECATED (real as a uint32), please use CUTENSOR_COMPUTE_32U instead.

enumerator CUTENSOR_R_MIN_8I = (1U << 8U): DEPRECATED (real as an int8), please use CUTENSOR_COMPUTE_8I instead.

enumerator CUTENSOR_R_MIN_32I = (1U << 9U): DEPRECATED (real as an int32), please use CUTENSOR_COMPUTE_32I instead.

enumerator CUTENSOR_R_MIN_16BF = (1U << 10U): DEPRECATED (real as a bfloat16), please use CUTENSOR_COMPUTE_16BF instead.

enumerator CUTENSOR_R_MIN_TF32 = (1U << 11U): DEPRECATED (real as a tensorfloat32), please use CUTENSOR_COMPUTE_TF32 instead.

enumerator CUTENSOR_C_MIN_TF32 = (1U << 12U): DEPRECATED (complex as a tensorfloat32), please use CUTENSOR_COMPUTE_TF32 instead.
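
The values above are single-bit flags, and a deprecated name does not always share the numeric value of its suggested replacement (compare CUTENSOR_R_MIN_TF32 with CUTENSOR_COMPUTE_TF32). The following is a minimal sketch, not library code: it mirrors a few enumerators locally so that it compiles without cuTENSOR installed, and `mantissa_bits` is a hypothetical helper that simply restates the mantissa widths listed above.

```c
#include <assert.h>

/* Local mirror of a few enumerators listed above so the sketch compiles
 * without cuTENSOR installed; real code would include <cutensor.h>. */
typedef enum {
    CUTENSOR_COMPUTE_16F  = (1U << 0U),
    CUTENSOR_COMPUTE_16BF = (1U << 10U),
    CUTENSOR_COMPUTE_TF32 = (1U << 12U),
    CUTENSOR_COMPUTE_32F  = (1U << 2U),
    CUTENSOR_COMPUTE_64F  = (1U << 4U),
    CUTENSOR_COMPUTE_32I  = (1U << 9U),
    CUTENSOR_R_MIN_32F    = (1U << 2U),  /* deprecated; same value as COMPUTE_32F */
    CUTENSOR_R_MIN_TF32   = (1U << 11U)  /* deprecated; differs from COMPUTE_TF32 */
} cutensorComputeType_t;

/* Hypothetical helper (not part of cuTENSOR): number of mantissa bits of a
 * floating-point compute type, or -1 for any other value. */
int mantissa_bits(cutensorComputeType_t type)
{
    switch (type) {
    case CUTENSOR_COMPUTE_16F:  return 10;
    case CUTENSOR_COMPUTE_16BF: return 7;
    case CUTENSOR_COMPUTE_TF32: return 10;
    case CUTENSOR_COMPUTE_32F:  return 23;
    case CUTENSOR_COMPUTE_64F:  return 52;
    default:                    return -1;  /* integer or unrecognized type */
    }
}
```

Because CUTENSOR_R_MIN_32F and CUTENSOR_COMPUTE_32F share a value, the helper treats them identically; CUTENSOR_R_MIN_TF32 falls through to the default branch, which is one reason to migrate to the CUTENSOR_COMPUTE_* names rather than rely on numeric equality.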

`cutensorHandle_t`

struct cutensorHandle_t

Brief: Opaque structure holding cuTENSOR’s library context.

`cutensorTensorDescriptor_t`

struct cutensorTensorDescriptor_t

Brief: Opaque structure representing a tensor descriptor.

`cutensorContractionDescriptor_t`

struct cutensorContractionDescriptor_t

Brief: Opaque structure representing a tensor contraction descriptor.

`cutensorContractionDescriptorAttributes_t`

enum cutensorContractionDescriptorAttributes_t

This enum lists all attributes of a cutensorContractionDescriptor_t that can be modified.

Values:

enumerator CUTENSOR_CONTRACTION_DESCRIPTOR_TAG: uint32_t: enables users to distinguish two otherwise identical tensor contractions with respect to the software-managed plan cache. (default value: 0)

`cutensorContractionFind_t`

struct cutensorContractionFind_t

Brief: Opaque structure representing a candidate.

`cutensorContractionFindAttributes_t`

enum cutensorContractionFindAttributes_t

This enum lists all attributes of a cutensorContractionFind_t that can be modified.

Values:

enumerator CUTENSOR_CONTRACTION_FIND_AUTOTUNE_MODE: cutensorAutotuneMode_t: Determines whether the corresponding algorithm/kernel for this plan should be cached.

enumerator CUTENSOR_CONTRACTION_FIND_CACHE_MODE: cutensorCacheMode_t: Gives fine control over what is considered a cache hit.

enumerator CUTENSOR_CONTRACTION_FIND_INCREMENTAL_COUNT: uint32_t: Only applicable if CUTENSOR_CONTRACTION_FIND_AUTOTUNE_MODE is set to CUTENSOR_AUTOTUNE_INCREMENTAL.

`cutensorContractionPlan_t`

struct cutensorContractionPlan_t

Brief: Opaque structure representing a plan.

`cutensorAutotuneMode_t`

enum cutensorAutotuneMode_t

This enum is relevant to cuTENSOR’s plan-caching capability.

Values:

enumerator CUTENSOR_AUTOTUNE_NONE: Indicates no autotuning (default); in this case the cache helps to reduce the plan-creation overhead. On a cache hit, the cached plan is reused; otherwise the plan cache is ignored.

enumerator CUTENSOR_AUTOTUNE_INCREMENTAL: Indicates incremental autotuning: each invocation of the corresponding cutensorInitContractionPlan() creates a plan based on a different algorithm/kernel, and the maximum number of kernels that will be tested is defined by the CUTENSOR_CONTRACTION_FIND_INCREMENTAL_COUNT attribute (see cutensorContractionFindAttributes_t). WARNING: If this autotuning mode is selected, bit-wise identical results cannot be guaranteed, since different algorithms could be executed.

`cutensorCacheMode_t`

enum cutensorCacheMode_t

This enum defines what is considered a cache hit.

Values:

enumerator CUTENSOR_CACHE_MODE_NONE: Plan will not be cached.

enumerator CUTENSOR_CACHE_MODE_PEDANTIC: All parameters of the corresponding descriptor must be identical to the cached plan (default).

`cutensorAlgo_t`

enum cutensorAlgo_t

Brief: Allows users to specify the algorithm to be used for performing the tensor contraction.
Details: This enum gives users finer control over which algorithm should be executed by cutensorContraction(); values >= 0 correspond to certain sub-algorithms of GETT.

Values:

enumerator CUTENSOR_ALGO_GETT = -4: Choose the GETT algorithm.

enumerator CUTENSOR_ALGO_TGETT = -3: Transpose (A or B) + GETT.

enumerator CUTENSOR_ALGO_TTGT = -2: Transpose-Transpose-GEMM-Transpose (requires additional memory).

enumerator CUTENSOR_ALGO_DEFAULT = -1: Lets the internal heuristic choose.
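
Since only the negative values have names, code that logs or validates an algorithm choice has to treat non-negative values separately. A small sketch of that distinction follows; the enum is mirrored locally so the block compiles on its own, and `algo_description` is a hypothetical helper, not a cuTENSOR function.

```c
#include <assert.h>
#include <string.h>

/* Local mirror of the enumerators listed above; real code would include
 * <cutensor.h> instead. */
typedef enum {
    CUTENSOR_ALGO_GETT    = -4,
    CUTENSOR_ALGO_TGETT   = -3,
    CUTENSOR_ALGO_TTGT    = -2,
    CUTENSOR_ALGO_DEFAULT = -1
} cutensorAlgo_t;

/* Hypothetical helper: takes an int because values >= 0 (GETT
 * sub-algorithms) have no named enumerator. */
const char *algo_description(int algo)
{
    if (algo >= 0)
        return "GETT sub-algorithm";
    switch (algo) {
    case CUTENSOR_ALGO_GETT:    return "GETT";
    case CUTENSOR_ALGO_TGETT:   return "Transpose (A or B) + GETT";
    case CUTENSOR_ALGO_TTGT:    return "Transpose-Transpose-GEMM-Transpose";
    case CUTENSOR_ALGO_DEFAULT: return "internal heuristic";
    default:                    return "unknown";
    }
}
```

Which non-negative values are valid for a given contraction is not stated in this section, so the helper only reports the category and does not attempt to range-check them.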

`cutensorWorksizePreference_t`

enum cutensorWorksizePreference_t

Brief: This enum gives users finer control over the suggested workspace.
Details: This enum gives users finer control over the amount of workspace that is suggested by cutensorContractionGetWorkspace.

Values:

enumerator CUTENSOR_WORKSPACE_MIN = 1: At least one algorithm will be available.

enumerator CUTENSOR_WORKSPACE_RECOMMENDED = 2: The most suitable algorithm will be available.

enumerator CUTENSOR_WORKSPACE_MAX = 3: All algorithms will be available.
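
One possible way to use the three preferences is to query the suggested workspace for each and keep the most generous one that fits a memory budget. The sketch below illustrates only that selection step under stated assumptions: the sizes are supplied by the caller (in a real program they would come from the workspace query), the enum is mirrored locally, and `pick_preference` is a hypothetical helper that is not part of cuTENSOR.

```c
#include <assert.h>
#include <stddef.h>

/* Local mirror of the enumerators listed above; real code would include
 * <cutensor.h> instead. */
typedef enum {
    CUTENSOR_WORKSPACE_MIN         = 1,
    CUTENSOR_WORKSPACE_RECOMMENDED = 2,
    CUTENSOR_WORKSPACE_MAX         = 3
} cutensorWorksizePreference_t;

/* Hypothetical helper: sizes[p - 1] holds the workspace size in bytes that
 * was suggested for preference p. Returns the largest preference whose
 * workspace fits into `budget` bytes, or 0 if not even the minimum fits. */
int pick_preference(const size_t sizes[3], size_t budget)
{
    int pref;

    assert(sizes != NULL);
    for (pref = CUTENSOR_WORKSPACE_MAX; pref >= CUTENSOR_WORKSPACE_MIN; --pref) {
        if (sizes[pref - 1] <= budget)
            return pref;
    }
    return 0;
}
```

Preferring the largest fitting workspace is a design choice of this sketch: it keeps the most algorithms available, at the cost of more device memory. A caller that cares more about memory footprint could start from CUTENSOR_WORKSPACE_RECOMMENDED instead.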

`cutensorOperator_t`

enum cutensorOperator_t

Brief: This enum captures all unary and binary element-wise operations supported by the cuTENSOR library.

Values:

enumerator CUTENSOR_OP_IDENTITY = 1: Identity operator (i.e., elements are not changed)

enumerator CUTENSOR_OP_SQRT = 2: Square root.

enumerator CUTENSOR_OP_RELU = 8: Rectified linear unit.

enumerator CUTENSOR_OP_CONJ = 9: Complex conjugate.

enumerator CUTENSOR_OP_RCP = 10: Reciprocal.

enumerator CUTENSOR_OP_SIGMOID = 11: y=1/(1+exp(-x))

enumerator CUTENSOR_OP_TANH = 12: y=tanh(x)

enumerator CUTENSOR_OP_EXP = 22: Exponentiation.

enumerator CUTENSOR_OP_LOG = 23: Log (base e).

enumerator CUTENSOR_OP_ABS = 24: Absolute value.

enumerator CUTENSOR_OP_NEG = 25: Negation.

enumerator CUTENSOR_OP_SIN = 26: Sine.

enumerator CUTENSOR_OP_COS = 27: Cosine.

enumerator CUTENSOR_OP_TAN = 28: Tangent.

enumerator CUTENSOR_OP_SINH = 29: Hyperbolic sine.

enumerator CUTENSOR_OP_COSH = 30: Hyperbolic cosine.

enumerator CUTENSOR_OP_ASIN = 31: Inverse sine.

enumerator CUTENSOR_OP_ACOS = 32: Inverse cosine.

enumerator CUTENSOR_OP_ATAN = 33: Inverse tangent.

enumerator CUTENSOR_OP_ASINH = 34: Inverse hyperbolic sine.

enumerator CUTENSOR_OP_ACOSH = 35: Inverse hyperbolic cosine.

enumerator CUTENSOR_OP_ATANH = 36: Inverse hyperbolic tangent.

enumerator CUTENSOR_OP_CEIL = 37: Ceiling.

enumerator CUTENSOR_OP_FLOOR = 38: Floor.

enumerator CUTENSOR_OP_ADD = 3: Addition of two elements.

enumerator CUTENSOR_OP_MUL = 5: Multiplication of two elements.

enumerator CUTENSOR_OP_MAX = 6: Maximum of two elements.

enumerator CUTENSOR_OP_MIN = 7: Minimum of two elements.

enumerator CUTENSOR_OP_UNKNOWN = 126: Reserved for internal use only.
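
The list mixes unary operators with the four binary ones (those described as acting on "two elements"). As a rough host-side illustration of that split, the sketch below mirrors a handful of enumerators and applies the binary operators to plain doubles. Both helpers are hypothetical and only restate the one-line descriptions above; they say nothing about how the library handles special values such as NaN.

```c
#include <assert.h>
#include <stddef.h>

/* Local mirror of a few enumerators listed above; real code would include
 * <cutensor.h> instead. */
typedef enum {
    CUTENSOR_OP_IDENTITY = 1,
    CUTENSOR_OP_ADD      = 3,
    CUTENSOR_OP_MUL      = 5,
    CUTENSOR_OP_MAX      = 6,
    CUTENSOR_OP_MIN      = 7,
    CUTENSOR_OP_RELU     = 8
} cutensorOperator_t;

/* Hypothetical helper: 1 if `op` combines two elements, 0 otherwise. */
int is_binary_op(cutensorOperator_t op)
{
    return op == CUTENSOR_OP_ADD || op == CUTENSOR_OP_MUL ||
           op == CUTENSOR_OP_MAX || op == CUTENSOR_OP_MIN;
}

/* Hypothetical host-side reference for the binary operators. Stores the
 * result in *out and returns 1, or returns 0 if `op` is not binary. */
int apply_binary_op(cutensorOperator_t op, double a, double b, double *out)
{
    assert(out != NULL);
    switch (op) {
    case CUTENSOR_OP_ADD: *out = a + b;           return 1;
    case CUTENSOR_OP_MUL: *out = a * b;           return 1;
    case CUTENSOR_OP_MAX: *out = (a > b) ? a : b; return 1;
    case CUTENSOR_OP_MIN: *out = (a < b) ? a : b; return 1;
    default:                                      return 0;
    }
}
```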

`cutensorStatus_t`

enum cutensorStatus_t

Brief: cuTENSOR status return type.
Details: The type is used for function status returns. All cuTENSOR library functions return their status, which can have the following values.

Values:

enumerator CUTENSOR_STATUS_SUCCESS = 0: The operation completed successfully.

enumerator CUTENSOR_STATUS_NOT_INITIALIZED = 1: The cuTENSOR library was not initialized.

enumerator CUTENSOR_STATUS_ALLOC_FAILED = 3: Resource allocation failed inside the cuTENSOR library.

enumerator CUTENSOR_STATUS_INVALID_VALUE = 7: An unsupported value or parameter was passed to the function (indicates a user error).

enumerator CUTENSOR_STATUS_ARCH_MISMATCH = 8: Indicates that the device is either not ready, or the target architecture is not supported.

enumerator CUTENSOR_STATUS_MAPPING_ERROR = 11: An access to GPU memory space failed, which is usually caused by a failure to bind a texture.

enumerator CUTENSOR_STATUS_EXECUTION_FAILED = 13: The GPU program failed to execute. This is often caused by a launch failure of the kernel on the GPU, which can be caused by multiple reasons.

enumerator CUTENSOR_STATUS_INTERNAL_ERROR = 14: An internal cuTENSOR error has occurred.

enumerator CUTENSOR_STATUS_NOT_SUPPORTED = 15: The requested operation is not supported.

enumerator CUTENSOR_STATUS_LICENSE_ERROR = 16: The functionality requested requires some license and an error was detected when trying to check the current licensing.

enumerator CUTENSOR_STATUS_CUBLAS_ERROR = 17: A call to CUBLAS did not succeed.

enumerator CUTENSOR_STATUS_CUDA_ERROR = 18: Some unknown CUDA error has occurred.

enumerator CUTENSOR_STATUS_INSUFFICIENT_WORKSPACE = 19: The provided workspace was insufficient.

enumerator CUTENSOR_STATUS_INSUFFICIENT_DRIVER = 20: Indicates that the driver version is insufficient.

enumerator CUTENSOR_STATUS_IO_ERROR = 21: Indicates an error related to file I/O.
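
Because every library function returns one of these values, callers usually wrap each call in a check. The sketch below shows one way such a check could look. It mirrors the enum locally so that it compiles without cuTENSOR, and `status_name` and `status_ok` are hypothetical helpers; with the real library, cutensorGetErrorString() is the usual way to obtain a message for a status.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Local mirror of the enumerators listed above; real code would include
 * <cutensor.h> instead. Note that the numeric values are not contiguous. */
typedef enum {
    CUTENSOR_STATUS_SUCCESS                = 0,
    CUTENSOR_STATUS_NOT_INITIALIZED        = 1,
    CUTENSOR_STATUS_ALLOC_FAILED           = 3,
    CUTENSOR_STATUS_INVALID_VALUE          = 7,
    CUTENSOR_STATUS_ARCH_MISMATCH          = 8,
    CUTENSOR_STATUS_MAPPING_ERROR          = 11,
    CUTENSOR_STATUS_EXECUTION_FAILED       = 13,
    CUTENSOR_STATUS_INTERNAL_ERROR         = 14,
    CUTENSOR_STATUS_NOT_SUPPORTED          = 15,
    CUTENSOR_STATUS_LICENSE_ERROR          = 16,
    CUTENSOR_STATUS_CUBLAS_ERROR           = 17,
    CUTENSOR_STATUS_CUDA_ERROR             = 18,
    CUTENSOR_STATUS_INSUFFICIENT_WORKSPACE = 19,
    CUTENSOR_STATUS_INSUFFICIENT_DRIVER    = 20,
    CUTENSOR_STATUS_IO_ERROR               = 21
} cutensorStatus_t;

/* Expands to a case label that returns the enumerator's own spelling. */
#define STATUS_CASE(s) case s: return #s

/* Hypothetical helper: enumerator name for a status value. */
const char *status_name(cutensorStatus_t status)
{
    switch (status) {
    STATUS_CASE(CUTENSOR_STATUS_SUCCESS);
    STATUS_CASE(CUTENSOR_STATUS_NOT_INITIALIZED);
    STATUS_CASE(CUTENSOR_STATUS_ALLOC_FAILED);
    STATUS_CASE(CUTENSOR_STATUS_INVALID_VALUE);
    STATUS_CASE(CUTENSOR_STATUS_ARCH_MISMATCH);
    STATUS_CASE(CUTENSOR_STATUS_MAPPING_ERROR);
    STATUS_CASE(CUTENSOR_STATUS_EXECUTION_FAILED);
    STATUS_CASE(CUTENSOR_STATUS_INTERNAL_ERROR);
    STATUS_CASE(CUTENSOR_STATUS_NOT_SUPPORTED);
    STATUS_CASE(CUTENSOR_STATUS_LICENSE_ERROR);
    STATUS_CASE(CUTENSOR_STATUS_CUBLAS_ERROR);
    STATUS_CASE(CUTENSOR_STATUS_CUDA_ERROR);
    STATUS_CASE(CUTENSOR_STATUS_INSUFFICIENT_WORKSPACE);
    STATUS_CASE(CUTENSOR_STATUS_INSUFFICIENT_DRIVER);
    STATUS_CASE(CUTENSOR_STATUS_IO_ERROR);
    default: return "unknown cuTENSOR status";
    }
}

/* Hypothetical helper: returns 1 on success; otherwise reports the failing
 * call site on stderr and returns 0 so the caller can clean up. */
int status_ok(cutensorStatus_t status, const char *where)
{
    if (status == CUTENSOR_STATUS_SUCCESS)
        return 1;
    fprintf(stderr, "%s: %s (%d)\n", where, status_name(status), (int)status);
    return 0;
}
```

A caller would typically write `if (!status_ok(status, "plan creation")) { /* release resources and return */ }` after each library call, where the label string is whatever identifies the call site.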
