cuTENSOR Data Types¶
cutensorComputeType_t¶
-
enum
cutensorComputeType_t¶
-
- Brief
-
Encodes cuTENSOR’s compute type (see “User Guide - Accuracy Guarantees” for details).
Values:
-
enumerator
CUTENSOR_COMPUTE_16F= (1U << 0U)¶
-
floating-point: 5-bit exponent and 10-bit mantissa (aka half)
-
enumerator
CUTENSOR_COMPUTE_16BF= (1U << 10U)¶
-
floating-point: 8-bit exponent and 7-bit mantissa (aka bfloat)
-
enumerator
CUTENSOR_COMPUTE_TF32= (1U << 12U)¶
-
floating-point: 8-bit exponent and 10-bit mantissa (aka tensor-float-32)
-
enumerator
CUTENSOR_COMPUTE_32F= (1U << 2U)¶
-
floating-point: 8-bit exponent and 23-bit mantissa (aka float)
-
enumerator
CUTENSOR_COMPUTE_64F= (1U << 4U)¶
-
floating-point: 11-bit exponent and 52-bit mantissa (aka double)
-
enumerator
CUTENSOR_COMPUTE_8U= (1U << 6U)¶
-
8-bit unsigned integer
-
enumerator
CUTENSOR_COMPUTE_8I= (1U << 8U)¶
-
8-bit signed integer
-
enumerator
CUTENSOR_COMPUTE_32U= (1U << 7U)¶
-
32-bit unsigned integer
-
enumerator
CUTENSOR_COMPUTE_32I= (1U << 9U)¶
-
32-bit signed integer
-
enumerator
CUTENSOR_R_MIN_16F= (1U << 0U)¶
-
DEPRECATED (real as a half), please use CUTENSOR_COMPUTE_16F instead.
-
enumerator
CUTENSOR_C_MIN_16F= (1U << 1U)¶
-
DEPRECATED (complex as a half), please use CUTENSOR_COMPUTE_16F instead.
-
enumerator
CUTENSOR_R_MIN_32F= (1U << 2U)¶
-
DEPRECATED (real as a float), please use CUTENSOR_COMPUTE_32F instead.
-
enumerator
CUTENSOR_C_MIN_32F= (1U << 3U)¶
-
DEPRECATED (complex as a float), please use CUTENSOR_COMPUTE_32F instead.
-
enumerator
CUTENSOR_R_MIN_64F= (1U << 4U)¶
-
DEPRECATED (real as a double), please use CUTENSOR_COMPUTE_64F instead.
-
enumerator
CUTENSOR_C_MIN_64F= (1U << 5U)¶
-
DEPRECATED (complex as a double), please use CUTENSOR_COMPUTE_64F instead.
-
enumerator
CUTENSOR_R_MIN_8U= (1U << 6U)¶
-
DEPRECATED (real as a uint8), please use CUTENSOR_COMPUTE_8U instead.
-
enumerator
CUTENSOR_R_MIN_32U= (1U << 7U)¶
-
DEPRECATED (real as a uint32), please use CUTENSOR_COMPUTE_32U instead.
-
enumerator
CUTENSOR_R_MIN_8I= (1U << 8U)¶
-
DEPRECATED (real as an int8), please use CUTENSOR_COMPUTE_8I instead.
-
enumerator
CUTENSOR_R_MIN_32I= (1U << 9U)¶
-
DEPRECATED (real as an int32), please use CUTENSOR_COMPUTE_32I instead.
-
enumerator
CUTENSOR_R_MIN_16BF= (1U << 10U)¶
-
DEPRECATED (real as a bfloat16), please use CUTENSOR_COMPUTE_16BF instead.
-
enumerator
CUTENSOR_R_MIN_TF32= (1U << 11U)¶
-
DEPRECATED (real as a tensorfloat32), please use CUTENSOR_COMPUTE_TF32 instead.
-
enumerator
CUTENSOR_C_MIN_TF32= (1U << 12U)¶
-
DEPRECATED (complex as a tensorfloat32), please use CUTENSOR_COMPUTE_TF32 instead.
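The table above pairs every deprecated `CUTENSOR_R_MIN_*` / `CUTENSOR_C_MIN_*` value with a recommended `CUTENSOR_COMPUTE_*` replacement. A minimal sketch of that mapping, assuming only the bit values listed above, is given below. The `SKETCH_` constants and the `sketch_modern_compute_type` helper are hypothetical names introduced for illustration; they are not part of cuTENSOR, and the code deliberately does not include the cuTENSOR header.

```c
#include <stdint.h>

/* Local mirror of the CUTENSOR_COMPUTE_* bit values listed above.
 * The SKETCH_ names are hypothetical and NOT part of cuTENSOR. */
enum {
    SKETCH_COMPUTE_16F  = 1u << 0,
    SKETCH_COMPUTE_32F  = 1u << 2,
    SKETCH_COMPUTE_64F  = 1u << 4,
    SKETCH_COMPUTE_8U   = 1u << 6,
    SKETCH_COMPUTE_32U  = 1u << 7,
    SKETCH_COMPUTE_8I   = 1u << 8,
    SKETCH_COMPUTE_32I  = 1u << 9,
    SKETCH_COMPUTE_16BF = 1u << 10,
    SKETCH_COMPUTE_TF32 = 1u << 12
};

/* Map any value from the table (deprecated or current) to the value of the
 * replacement that the table recommends. Returns 0 for an unlisted value. */
uint32_t sketch_modern_compute_type(uint32_t value)
{
    switch (value) {
    case 1u << 0:   /* COMPUTE_16F, R_MIN_16F */
    case 1u << 1:   /* C_MIN_16F */
        return SKETCH_COMPUTE_16F;
    case 1u << 2:   /* COMPUTE_32F, R_MIN_32F */
    case 1u << 3:   /* C_MIN_32F */
        return SKETCH_COMPUTE_32F;
    case 1u << 4:   /* COMPUTE_64F, R_MIN_64F */
    case 1u << 5:   /* C_MIN_64F */
        return SKETCH_COMPUTE_64F;
    case 1u << 6:   /* COMPUTE_8U, R_MIN_8U */
        return SKETCH_COMPUTE_8U;
    case 1u << 7:   /* COMPUTE_32U, R_MIN_32U */
        return SKETCH_COMPUTE_32U;
    case 1u << 8:   /* COMPUTE_8I, R_MIN_8I */
        return SKETCH_COMPUTE_8I;
    case 1u << 9:   /* COMPUTE_32I, R_MIN_32I */
        return SKETCH_COMPUTE_32I;
    case 1u << 10:  /* COMPUTE_16BF, R_MIN_16BF */
        return SKETCH_COMPUTE_16BF;
    case 1u << 11:  /* R_MIN_TF32 */
    case 1u << 12:  /* COMPUTE_TF32, C_MIN_TF32 */
        return SKETCH_COMPUTE_TF32;
    default:
        return 0u;
    }
}
```

Note that, going by the listed values, the complex variants (`CUTENSOR_C_MIN_16F`, `CUTENSOR_C_MIN_32F`, `CUTENSOR_C_MIN_64F`) and `CUTENSOR_R_MIN_TF32` do not share a bit position with their replacement, so code that stores or compares the raw integer may need updating when migrating.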
cutensorHandle_t¶
-
struct
cutensorHandle_t¶
-
- Brief
-
Opaque structure holding cuTENSOR’s library context.
cutensorTensorDescriptor_t¶
-
struct
cutensorTensorDescriptor_t¶
-
- Brief
-
Opaque structure representing a tensor descriptor.
cutensorContractionDescriptor_t¶
-
struct
cutensorContractionDescriptor_t¶
-
- Brief
-
Opaque structure representing a tensor contraction descriptor.
cutensorContractionDescriptorAttributes_t¶
-
enum
cutensorContractionDescriptorAttributes_t¶
-
This enum lists all attributes of a cutensorContractionDescriptor_t that can be modified.
Values:
-
enumerator
CUTENSOR_CONTRACTION_DESCRIPTOR_TAG¶
-
uint32_t: enables users to distinguish two identical tensor contractions w.r.t. the sw-managed plan-cache. (default value: 0)
cutensorContractionFind_t¶
-
struct
cutensorContractionFind_t¶
-
- Brief
-
Opaque structure representing a candidate.
cutensorContractionFindAttributes_t¶
-
enum
cutensorContractionFindAttributes_t¶
-
This enum lists all attributes of a cutensorContractionFind_t that can be modified.
Values:
-
enumerator
CUTENSOR_CONTRACTION_FIND_AUTOTUNE_MODE¶
-
cutensorAutotuneMode_t: Determines if the corresponding algorithm/kernel for this plan should be cached.
-
enumerator
CUTENSOR_CONTRACTION_FIND_CACHE_MODE¶
-
cutensorCacheMode_t: Gives fine control over what is considered a cache hit.
-
enumerator
CUTENSOR_CONTRACTION_FIND_INCREMENTAL_COUNT¶
-
uint32_t: Only applicable if CUTENSOR_CONTRACTION_FIND_AUTOTUNE_MODE is set to CUTENSOR_AUTOTUNE_INCREMENTAL.
cutensorContractionPlan_t¶
-
struct
cutensorContractionPlan_t¶
-
- Brief
-
Opaque structure representing a plan.
cutensorAutotuneMode_t¶
-
enum
cutensorAutotuneMode_t¶
-
This enum controls how autotuning interacts with cuTENSOR’s plan cache.
Values:
-
enumerator
CUTENSOR_AUTOTUNE_NONE¶
-
Indicates no autotuning (default); in this case the cache helps to reduce the plan-creation overhead. On a cache hit the cached plan is reused; otherwise the plan cache is ignored.
-
enumerator
CUTENSOR_AUTOTUNE_INCREMENTAL¶
-
Indicates incremental autotuning: each invocation of the corresponding cutensorInitContractionPlan() creates a plan based on a different algorithm/kernel. The maximum number of kernels that will be tested is defined by the CUTENSOR_CONTRACTION_FIND_INCREMENTAL_COUNT attribute (see cutensorContractionFindAttributes_t). WARNING: If this autotuning mode is selected, bit-wise identical results cannot be guaranteed, since different algorithms could be executed.
cutensorCacheMode_t¶
-
enum
cutensorCacheMode_t¶
-
This enum defines what is considered a cache hit.
Values:
-
enumerator
CUTENSOR_CACHE_MODE_NONE¶
-
Plan will not be cached.
-
enumerator
CUTENSOR_CACHE_MODE_PEDANTIC¶
-
All parameters of the corresponding descriptor must be identical to the cached plan (default).
cutensorAlgo_t¶
-
enum
cutensorAlgo_t¶
-
- Brief
-
Allows users to specify the algorithm to be used for performing the tensor contraction.
- Details
-
This enum gives users finer control over which algorithm should be executed by cutensorContraction(); values >= 0 correspond to certain sub-algorithms of GETT.
Values:
-
enumerator
CUTENSOR_ALGO_GETT= -4¶
-
Choose the GETT algorithm.
-
enumerator
CUTENSOR_ALGO_TGETT= -3¶
-
Transpose (A or B) + GETT.
-
enumerator
CUTENSOR_ALGO_TTGT= -2¶
-
Transpose-Transpose-GEMM-Transpose (requires additional memory)
-
enumerator
CUTENSOR_ALGO_DEFAULT= -1¶
-
Lets the internal heuristic choose.
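To make the value ranges above concrete, here is a small sketch that classifies an integer algorithm value using only what the table states: the four named negative values, and non-negative values denoting GETT sub-algorithms. The helper names `sketch_is_gett_subalgorithm` and `sketch_algo_name` are hypothetical and not part of cuTENSOR; how many sub-algorithm indices are actually valid is not specified here, so the sketch does not try to bound them.

```c
/* Returns 1 if the value selects a specific GETT sub-algorithm (>= 0),
 * 0 otherwise. Hypothetical helper, not a cuTENSOR function. */
int sketch_is_gett_subalgorithm(int algo)
{
    return algo >= 0;
}

/* Returns a printable name for an algorithm value, based on the table above.
 * Hypothetical helper, not a cuTENSOR function. */
const char *sketch_algo_name(int algo)
{
    if (sketch_is_gett_subalgorithm(algo)) {
        return "GETT sub-algorithm";
    }
    switch (algo) {
    case -4: return "CUTENSOR_ALGO_GETT";
    case -3: return "CUTENSOR_ALGO_TGETT";
    case -2: return "CUTENSOR_ALGO_TTGT";
    case -1: return "CUTENSOR_ALGO_DEFAULT";
    default: return "unrecognized algorithm value";
    }
}
```

A caller could, for example, log `sketch_algo_name(algo)` next to timing results when comparing algorithm choices.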
cutensorWorksizePreference_t¶
-
enum
cutensorWorksizePreference_t¶
-
- Brief
-
This enum gives users finer control over the suggested workspace.
- Details
-
This enum gives users finer control over the amount of workspace that is suggested by cutensorContractionGetWorkspace.
Values:
-
enumerator
CUTENSOR_WORKSPACE_MIN= 1¶
-
At least one algorithm will be available.
-
enumerator
CUTENSOR_WORKSPACE_RECOMMENDED= 2¶
-
The most suitable algorithm will be available.
-
enumerator
CUTENSOR_WORKSPACE_MAX= 3¶
-
All algorithms will be available.
cutensorOperator_t¶
-
enum
cutensorOperator_t¶
-
- Brief
-
This enum captures all unary and binary element-wise operations supported by the cuTENSOR library.
Values:
-
enumerator
CUTENSOR_OP_IDENTITY= 1¶
-
Identity operator (i.e., elements are not changed)
-
enumerator
CUTENSOR_OP_SQRT= 2¶
-
Square root.
-
enumerator
CUTENSOR_OP_RELU= 8¶
-
Rectified linear unit.
-
enumerator
CUTENSOR_OP_CONJ= 9¶
-
Complex conjugate.
-
enumerator
CUTENSOR_OP_RCP= 10¶
-
Reciprocal.
-
enumerator
CUTENSOR_OP_SIGMOID= 11¶
-
y=1/(1+exp(-x))
-
enumerator
CUTENSOR_OP_TANH= 12¶
-
y=tanh(x)
-
enumerator
CUTENSOR_OP_EXP= 22¶
-
Exponentiation.
-
enumerator
CUTENSOR_OP_LOG= 23¶
-
Log (base e).
-
enumerator
CUTENSOR_OP_ABS= 24¶
-
Absolute value.
-
enumerator
CUTENSOR_OP_NEG= 25¶
-
Negation.
-
enumerator
CUTENSOR_OP_SIN= 26¶
-
Sine.
-
enumerator
CUTENSOR_OP_COS= 27¶
-
Cosine.
-
enumerator
CUTENSOR_OP_TAN= 28¶
-
Tangent.
-
enumerator
CUTENSOR_OP_SINH= 29¶
-
Hyperbolic sine.
-
enumerator
CUTENSOR_OP_COSH= 30¶
-
Hyperbolic cosine.
-
enumerator
CUTENSOR_OP_ASIN= 31¶
-
Inverse sine.
-
enumerator
CUTENSOR_OP_ACOS= 32¶
-
Inverse cosine.
-
enumerator
CUTENSOR_OP_ATAN= 33¶
-
Inverse tangent.
-
enumerator
CUTENSOR_OP_ASINH= 34¶
-
Inverse hyperbolic sine.
-
enumerator
CUTENSOR_OP_ACOSH= 35¶
-
Inverse hyperbolic cosine.
-
enumerator
CUTENSOR_OP_ATANH= 36¶
-
Inverse hyperbolic tangent.
-
enumerator
CUTENSOR_OP_CEIL= 37¶
-
Ceiling.
-
enumerator
CUTENSOR_OP_FLOOR= 38¶
-
Floor.
-
enumerator
CUTENSOR_OP_ADD= 3¶
-
Addition of two elements.
-
enumerator
CUTENSOR_OP_MUL= 5¶
-
Multiplication of two elements.
-
enumerator
CUTENSOR_OP_MAX= 6¶
-
Maximum of two elements.
-
enumerator
CUTENSOR_OP_MIN= 7¶
-
Minimum of two elements.
-
enumerator
CUTENSOR_OP_UNKNOWN= 126¶
-
reserved for internal use only
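As a reading aid for the operator table, the following host-side sketch spells out the element-wise meaning of a few unary and binary operators. It is an illustration of the mathematics only, under the assumption that each operator acts independently on every element; it says nothing about how cuTENSOR implements these operators, which data types each one supports, or how special values such as NaN are handled. The `SKETCH_` enum and both helper functions are hypothetical names; only the numeric values are taken from the table above.

```c
/* Subset of the operator values listed above; SKETCH_ names are
 * hypothetical and NOT part of cuTENSOR. */
enum sketch_op {
    SKETCH_OP_IDENTITY = 1,
    SKETCH_OP_ADD      = 3,
    SKETCH_OP_MUL      = 5,
    SKETCH_OP_MAX      = 6,
    SKETCH_OP_MIN      = 7,
    SKETCH_OP_RELU     = 8,
    SKETCH_OP_RCP      = 10,
    SKETCH_OP_ABS      = 24,
    SKETCH_OP_NEG      = 25
};

/* Apply a unary operator to a single element.
 * Operators not covered by this sketch behave like the identity. */
double sketch_apply_unary(enum sketch_op op, double x)
{
    switch (op) {
    case SKETCH_OP_RELU: return x > 0.0 ? x : 0.0;  /* rectified linear unit */
    case SKETCH_OP_RCP:  return 1.0 / x;            /* reciprocal */
    case SKETCH_OP_ABS:  return x < 0.0 ? -x : x;   /* absolute value */
    case SKETCH_OP_NEG:  return -x;                 /* negation */
    default:             return x;                  /* identity */
    }
}

/* Combine two elements with a binary operator.
 * Operators not covered by this sketch return the first operand. */
double sketch_apply_binary(enum sketch_op op, double a, double b)
{
    switch (op) {
    case SKETCH_OP_ADD: return a + b;
    case SKETCH_OP_MUL: return a * b;
    case SKETCH_OP_MAX: return a > b ? a : b;
    case SKETCH_OP_MIN: return a < b ? a : b;
    default:            return a;
    }
}
```

For instance, `sketch_apply_binary(SKETCH_OP_ADD, sketch_apply_unary(SKETCH_OP_RELU, a), b)` computes `relu(a) + b` for one pair of elements.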
cutensorStatus_t¶
-
enum
cutensorStatus_t¶
-
- Brief
-
cuTENSOR status return type.
- Details
-
The type is used for function status returns. All cuTENSOR library functions return their status, which can have the following values.
Values:
-
enumerator
CUTENSOR_STATUS_SUCCESS= 0¶
-
The operation completed successfully.
-
enumerator
CUTENSOR_STATUS_NOT_INITIALIZED= 1¶
-
The cuTENSOR library was not initialized.
-
enumerator
CUTENSOR_STATUS_ALLOC_FAILED= 3¶
-
Resource allocation failed inside the cuTENSOR library.
-
enumerator
CUTENSOR_STATUS_INVALID_VALUE= 7¶
-
An unsupported value or parameter was passed to the function (indicates a user error).
-
enumerator
CUTENSOR_STATUS_ARCH_MISMATCH= 8¶
-
Indicates that the device is either not ready, or the target architecture is not supported.
-
enumerator
CUTENSOR_STATUS_MAPPING_ERROR= 11¶
-
An access to GPU memory space failed, which is usually caused by a failure to bind a texture.
-
enumerator
CUTENSOR_STATUS_EXECUTION_FAILED= 13¶
-
The GPU program failed to execute. This is often caused by a launch failure of the kernel on the GPU, which can be caused by multiple reasons.
-
enumerator
CUTENSOR_STATUS_INTERNAL_ERROR= 14¶
-
An internal cuTENSOR error has occurred.
-
enumerator
CUTENSOR_STATUS_NOT_SUPPORTED= 15¶
-
The requested operation is not supported.
-
enumerator
CUTENSOR_STATUS_LICENSE_ERROR= 16¶
-
The functionality requested requires some license and an error was detected when trying to check the current licensing.
-
enumerator
CUTENSOR_STATUS_CUBLAS_ERROR= 17¶
-
A call to CUBLAS did not succeed.
-
enumerator
CUTENSOR_STATUS_CUDA_ERROR= 18¶
-
Some unknown CUDA error has occurred.
-
enumerator
CUTENSOR_STATUS_INSUFFICIENT_WORKSPACE= 19¶
-
The provided workspace was insufficient.
-
enumerator
CUTENSOR_STATUS_INSUFFICIENT_DRIVER= 20¶
-
Indicates that the driver version is insufficient.
-
enumerator
CUTENSOR_STATUS_IO_ERROR= 21¶
-
Indicates an error related to file I/O.
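Since every cuTENSOR function returns one of these status values, callers typically check each return value against CUTENSOR_STATUS_SUCCESS. The sketch below shows one way such a check could look, using only the numeric codes listed above. The helpers `sketch_status_name` and `sketch_check` are hypothetical names, not cuTENSOR functions, and the status is taken as a plain `int` so that the example compiles without the cuTENSOR header.

```c
#include <stdio.h>

/* Translate a status code from the table above into its enumerator name.
 * Hypothetical helper, not a cuTENSOR function. */
const char *sketch_status_name(int status)
{
    switch (status) {
    case 0:  return "CUTENSOR_STATUS_SUCCESS";
    case 1:  return "CUTENSOR_STATUS_NOT_INITIALIZED";
    case 3:  return "CUTENSOR_STATUS_ALLOC_FAILED";
    case 7:  return "CUTENSOR_STATUS_INVALID_VALUE";
    case 8:  return "CUTENSOR_STATUS_ARCH_MISMATCH";
    case 11: return "CUTENSOR_STATUS_MAPPING_ERROR";
    case 13: return "CUTENSOR_STATUS_EXECUTION_FAILED";
    case 14: return "CUTENSOR_STATUS_INTERNAL_ERROR";
    case 15: return "CUTENSOR_STATUS_NOT_SUPPORTED";
    case 16: return "CUTENSOR_STATUS_LICENSE_ERROR";
    case 17: return "CUTENSOR_STATUS_CUBLAS_ERROR";
    case 18: return "CUTENSOR_STATUS_CUDA_ERROR";
    case 19: return "CUTENSOR_STATUS_INSUFFICIENT_WORKSPACE";
    case 20: return "CUTENSOR_STATUS_INSUFFICIENT_DRIVER";
    case 21: return "CUTENSOR_STATUS_IO_ERROR";
    default: return "unrecognized status";
    }
}

/* Returns 1 if the status signals success; otherwise prints a diagnostic
 * naming the failed step to stderr and returns 0. */
int sketch_check(int status, const char *what)
{
    if (status == 0) {
        return 1;
    }
    fprintf(stderr, "%s failed: %s (%d)\n",
            what, sketch_status_name(status), status);
    return 0;
}
```

In real code the `status` argument would be the value returned by a cuTENSOR call, and a failed check would usually trigger cleanup and an early return.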