cuTENSOR Data Types¶
cutensorComputeType_t
¶
-
enum
cutensorComputeType_t
¶
-
- Brief
-
Encodes cuTENSOR’s compute type (see “User Guide - Accuracy Guarantees” for details).
Values:
-
enumerator
CUTENSOR_COMPUTE_16F
= (1U << 0U)¶
-
floating-point: 5-bit exponent and 10-bit mantissa (aka half)
-
enumerator
CUTENSOR_COMPUTE_16BF
= (1U << 10U)¶
-
floating-point: 8-bit exponent and 7-bit mantissa (aka bfloat)
-
enumerator
CUTENSOR_COMPUTE_TF32
= (1U << 12U)¶
-
floating-point: 8-bit exponent and 10-bit mantissa (aka tensor-float-32)
-
enumerator
CUTENSOR_COMPUTE_32F
= (1U << 2U)¶
-
floating-point: 8-bit exponent and 23-bit mantissa (aka float)
-
enumerator
CUTENSOR_COMPUTE_64F
= (1U << 4U)¶
-
floating-point: 11-bit exponent and 52-bit mantissa (aka double)
-
enumerator
CUTENSOR_COMPUTE_8U
= (1U << 6U)¶
-
8-bit unsigned integer
-
enumerator
CUTENSOR_COMPUTE_8I
= (1U << 8U)¶
-
8-bit signed integer
-
enumerator
CUTENSOR_COMPUTE_32U
= (1U << 7U)¶
-
32-bit unsigned integer
-
enumerator
CUTENSOR_COMPUTE_32I
= (1U << 9U)¶
-
32-bit signed integer
-
enumerator
CUTENSOR_R_MIN_16F
= (1U << 0U)¶
-
DEPRECATED (real as a half), please use CUTENSOR_COMPUTE_16F instead.
-
enumerator
CUTENSOR_C_MIN_16F
= (1U << 1U)¶
-
DEPRECATED (complex as a half), please use CUTENSOR_COMPUTE_16F instead.
-
enumerator
CUTENSOR_R_MIN_32F
= (1U << 2U)¶
-
DEPRECATED (real as a float), please use CUTENSOR_COMPUTE_32F instead.
-
enumerator
CUTENSOR_C_MIN_32F
= (1U << 3U)¶
-
DEPRECATED (complex as a float), please use CUTENSOR_COMPUTE_32F instead.
-
enumerator
CUTENSOR_R_MIN_64F
= (1U << 4U)¶
-
DEPRECATED (real as a double), please use CUTENSOR_COMPUTE_64F instead.
-
enumerator
CUTENSOR_C_MIN_64F
= (1U << 5U)¶
-
DEPRECATED (complex as a double), please use CUTENSOR_COMPUTE_64F instead.
-
enumerator
CUTENSOR_R_MIN_8U
= (1U << 6U)¶
-
DEPRECATED (real as a uint8), please use CUTENSOR_COMPUTE_8U instead.
-
enumerator
CUTENSOR_R_MIN_32U
= (1U << 7U)¶
-
DEPRECATED (real as a uint32), please use CUTENSOR_COMPUTE_32U instead.
-
enumerator
CUTENSOR_R_MIN_8I
= (1U << 8U)¶
-
DEPRECATED (real as an int8), please use CUTENSOR_COMPUTE_8I instead.
-
enumerator
CUTENSOR_R_MIN_32I
= (1U << 9U)¶
-
DEPRECATED (real as an int32), please use CUTENSOR_COMPUTE_32I instead.
-
enumerator
CUTENSOR_R_MIN_16BF
= (1U << 10U)¶
-
DEPRECATED (real as a bfloat16), please use CUTENSOR_COMPUTE_16BF instead.
-
enumerator
CUTENSOR_R_MIN_TF32
= (1U << 11U)¶
-
DEPRECATED (real as a tensorfloat32), please use CUTENSOR_COMPUTE_TF32 instead.
-
enumerator
CUTENSOR_C_MIN_TF32
= (1U << 12U)¶
-
DEPRECATED (complex as a tensorfloat32), please use CUTENSOR_COMPUTE_TF32 instead.
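The values above are single-bit flags, and most deprecated names share a bit with their replacement. A minimal sketch of how legacy values could be mapped to the recommended names is shown below. The enum is a local mirror of the values listed above (real code would include the cuTENSOR header instead), and the helper names `replacement_for` and `mantissa_bits` are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Local mirror of the non-deprecated values listed above (illustration only). */
enum {
    COMPUTE_16F  = 1U << 0U,
    COMPUTE_32F  = 1U << 2U,
    COMPUTE_64F  = 1U << 4U,
    COMPUTE_8U   = 1U << 6U,
    COMPUTE_32U  = 1U << 7U,
    COMPUTE_8I   = 1U << 8U,
    COMPUTE_32I  = 1U << 9U,
    COMPUTE_16BF = 1U << 10U,
    COMPUTE_TF32 = 1U << 12U
};

/* Hypothetical helper: maps a (possibly deprecated) value to the replacement
 * recommended above. Deprecated names that share a bit with their replacement
 * pass through unchanged. */
static uint32_t replacement_for(uint32_t v) {
    assert(v != 0 && (v & (v - 1U)) == 0); /* exactly one bit must be set */
    switch (v) {
    case 1U << 1U:  return COMPUTE_16F;  /* CUTENSOR_C_MIN_16F  */
    case 1U << 3U:  return COMPUTE_32F;  /* CUTENSOR_C_MIN_32F  */
    case 1U << 5U:  return COMPUTE_64F;  /* CUTENSOR_C_MIN_64F  */
    case 1U << 11U: return COMPUTE_TF32; /* CUTENSOR_R_MIN_TF32 */
    default:        return v;
    }
}

/* Hypothetical helper: mantissa width of a floating-point compute type as
 * documented above, or -1 for the integer compute types. */
static int mantissa_bits(uint32_t v) {
    switch (v) {
    case COMPUTE_16F:  return 10;
    case COMPUTE_16BF: return 7;
    case COMPUTE_TF32: return 10;
    case COMPUTE_32F:  return 23;
    case COMPUTE_64F:  return 52;
    default:           return -1;
    }
}
```

For example, `replacement_for(1U << 11U)` yields the value of CUTENSOR_COMPUTE_TF32, while `replacement_for(1U << 10U)` returns its argument because CUTENSOR_R_MIN_16BF and CUTENSOR_COMPUTE_16BF share the same bit.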
cutensorHandle_t
¶
-
struct
cutensorHandle_t
¶
-
- Brief
-
Opaque structure holding cuTENSOR’s library context.
cutensorTensorDescriptor_t
¶
-
struct
cutensorTensorDescriptor_t
¶
-
- Brief
-
Opaque structure representing a tensor descriptor.
cutensorContractionDescriptor_t
¶
-
struct
cutensorContractionDescriptor_t
¶
-
- Brief
-
Opaque structure representing a tensor contraction descriptor.
cutensorContractionDescriptorAttributes_t
¶
-
enum
cutensorContractionDescriptorAttributes_t
¶
-
This enum lists all attributes of a cutensorContractionDescriptor_t that can be modified.
Values:
-
enumerator
CUTENSOR_CONTRACTION_DESCRIPTOR_TAG
¶
-
uint32_t: Enables users to distinguish two otherwise identical tensor contractions with respect to the software-managed plan cache (default value: 0).
cutensorContractionFind_t
¶
-
struct
cutensorContractionFind_t
¶
-
- Brief
-
Opaque structure representing a candidate.
cutensorContractionFindAttributes_t
¶
-
enum
cutensorContractionFindAttributes_t
¶
-
This enum lists all attributes of a cutensorContractionFind_t that can be modified.
Values:
-
enumerator
CUTENSOR_CONTRACTION_FIND_AUTOTUNE_MODE
¶
-
cutensorAutotuneMode_t: Determines if the corresponding algorithm/kernel for this plan should be cached.
-
enumerator
CUTENSOR_CONTRACTION_FIND_CACHE_MODE
¶
-
cutensorCacheMode_t: Gives fine control over what is considered a cache hit.
-
enumerator
CUTENSOR_CONTRACTION_FIND_INCREMENTAL_COUNT
¶
-
uint32_t: Only applicable if CUTENSOR_CONTRACTION_FIND_AUTOTUNE_MODE is set to CUTENSOR_AUTOTUNE_INCREMENTAL.
cutensorContractionPlan_t
¶
-
struct
cutensorContractionPlan_t
¶
-
- Brief
-
Opaque structure representing a plan.
cutensorAutotuneMode_t
¶
-
enum
cutensorAutotuneMode_t
¶
-
This enum controls how autotuning interacts with cuTENSOR’s plan cache.
Values:
-
enumerator
CUTENSOR_AUTOTUNE_NONE
¶
-
Indicates no autotuning (default). In this case the cache helps reduce the plan-creation overhead: on a cache hit the cached plan is reused; otherwise the plan cache is ignored.
-
enumerator
CUTENSOR_AUTOTUNE_INCREMENTAL
¶
-
Indicates incremental autotuning: each invocation of the corresponding cutensorInitContractionPlan() creates a plan based on a different algorithm/kernel. The maximum number of kernels that will be tested is set by the CUTENSOR_CONTRACTION_FIND_INCREMENTAL_COUNT attribute (see cutensorContractionFindAttributes_t). WARNING: If this autotuning mode is selected, bit-wise identical results cannot be guaranteed, since different algorithms could be executed.
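The incremental scheme can be pictured with a small host-side simulation. This is only a sketch under stated assumptions, not cuTENSOR's implementation: the `autotune_state` struct and its functions are hypothetical, kernels are identified by plain indices, and the rule that the fastest kernel is reused once the budget is exhausted is an assumption that the text above does not state.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical bookkeeping for one contraction (not a cuTENSOR type). */
typedef struct {
    uint32_t incremental_count; /* mirrors CUTENSOR_CONTRACTION_FIND_INCREMENTAL_COUNT */
    uint32_t tried;             /* number of kernels tested so far */
    int      best_kernel;       /* fastest kernel seen so far, -1 if none */
    double   best_time_ms;      /* runtime of best_kernel */
} autotune_state;

static void autotune_init(autotune_state *s, uint32_t incremental_count) {
    assert(s != NULL && incremental_count > 0);
    s->incremental_count = incremental_count;
    s->tried = 0;
    s->best_kernel = -1;
    s->best_time_ms = 0.0;
}

/* Kernel for the next plan: a fresh candidate while the budget lasts,
 * afterwards the fastest one seen (assumption, see lead-in). */
static int autotune_next_kernel(const autotune_state *s) {
    if (s->tried < s->incremental_count)
        return (int)s->tried;
    return s->best_kernel;
}

/* Records the measured runtime of the candidate that was just executed. */
static void autotune_report(autotune_state *s, int kernel, double time_ms) {
    if (s->tried >= s->incremental_count)
        return; /* tuning already finished */
    assert(kernel == (int)s->tried);
    if (s->best_kernel < 0 || time_ms < s->best_time_ms) {
        s->best_kernel = kernel;
        s->best_time_ms = time_ms;
    }
    s->tried++;
}
```

With a count of 3 and measured times of 5.0, 2.0 and 4.0 ms, the first three calls to `autotune_next_kernel` return 0, 1 and 2, and every later call returns 1. Because different kernels run during the tuning phase, the results of those early invocations may differ bit-wise, which matches the warning above.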
cutensorCacheMode_t
¶
-
enum
cutensorCacheMode_t
¶
-
This enum defines what is considered a cache hit.
Values:
-
enumerator
CUTENSOR_CACHE_MODE_NONE
¶
-
Plan will not be cached.
-
enumerator
CUTENSOR_CACHE_MODE_PEDANTIC
¶
-
All parameters of the corresponding descriptor must be identical to the cached plan (default).
cutensorAlgo_t
¶
-
enum
cutensorAlgo_t
¶
-
- Brief
-
Allows users to specify the algorithm to be used for performing the tensor contraction.
- Details
-
This enum gives users finer control over which algorithm should be executed by cutensorContraction(); values >= 0 correspond to certain sub-algorithms of GETT.
Values:
-
enumerator
CUTENSOR_ALGO_GETT
= -4¶
-
Choose the GETT algorithm.
-
enumerator
CUTENSOR_ALGO_TGETT
= -3¶
-
Transpose (A or B) + GETT.
-
enumerator
CUTENSOR_ALGO_TTGT
= -2¶
-
Transpose-Transpose-GEMM-Transpose (requires additional memory)
-
enumerator
CUTENSOR_ALGO_DEFAULT
= -1¶
-
Lets the internal heuristic choose.
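The split between the named negative values and the non-negative sub-algorithm indices can be sketched as follows. The enum is a local mirror of the values listed above, and the helper names are hypothetical; real code would use cutensorAlgo_t from the cuTENSOR header.

```c
#include <assert.h>
#include <stdbool.h>

/* Local mirror of the values listed above (illustration only). */
enum { ALGO_GETT = -4, ALGO_TGETT = -3, ALGO_TTGT = -2, ALGO_DEFAULT = -1 };

/* Values >= 0 select a specific sub-algorithm of GETT. */
static bool is_gett_subalgorithm(int algo) {
    return algo >= 0;
}

/* Per the descriptions above, TTGT is the variant that requires additional memory. */
static bool requires_additional_memory(int algo) {
    return algo == ALGO_TTGT;
}

/* Human-readable label for logging. */
static const char *algo_label(int algo) {
    assert(algo >= ALGO_GETT); /* values below -4 are not listed above */
    switch (algo) {
    case ALGO_GETT:    return "GETT";
    case ALGO_TGETT:   return "TGETT";
    case ALGO_TTGT:    return "TTGT";
    case ALGO_DEFAULT: return "DEFAULT";
    default:           return "GETT sub-algorithm";
    }
}
```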
cutensorWorksizePreference_t
¶
-
enum
cutensorWorksizePreference_t
¶
-
- Brief
-
This enum gives users finer control over the suggested workspace.
- Details
-
This enum gives users finer control over the amount of workspace that is suggested by cutensorContractionGetWorkspace.
Values:
-
enumerator
CUTENSOR_WORKSPACE_MIN
= 1¶
-
At least one algorithm will be available.
-
enumerator
CUTENSOR_WORKSPACE_RECOMMENDED
= 2¶
-
The most suitable algorithm will be available.
-
enumerator
CUTENSOR_WORKSPACE_MAX
= 3¶
-
All algorithms will be available.
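One way to use these preferences is to pick the most generous one that still fits a memory budget. The sketch below assumes the caller has already queried the suggested size for each preference and stored the results in an array; the array, the budget, and the helper name `pick_preference` are hypothetical, and the sketch assumes the suggested sizes do not decrease from MIN to MAX.

```c
#include <assert.h>
#include <stddef.h>

/* Local mirror of the values listed above (illustration only). */
enum { WORKSPACE_MIN = 1, WORKSPACE_RECOMMENDED = 2, WORKSPACE_MAX = 3 };

/* sizes[p - 1] holds the workspace size in bytes suggested for preference p.
 * Returns the largest preference whose suggestion fits into budget_bytes,
 * or 0 if even WORKSPACE_MIN does not fit. */
static int pick_preference(const size_t sizes[3], size_t budget_bytes) {
    assert(sizes != NULL);
    for (int p = WORKSPACE_MAX; p >= WORKSPACE_MIN; --p) {
        if (sizes[p - 1] <= budget_bytes)
            return p;
    }
    return 0;
}
```

With suggested sizes of 1024, 4096 and 65536 bytes and a budget of 8192 bytes, the helper returns the value of CUTENSOR_WORKSPACE_RECOMMENDED.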
cutensorOperator_t
¶
-
enum
cutensorOperator_t
¶
-
- Brief
-
This enum captures all unary and binary element-wise operations supported by the cuTENSOR library.
Values:
-
enumerator
CUTENSOR_OP_IDENTITY
= 1¶
-
Identity operator (i.e., elements are not changed)
-
enumerator
CUTENSOR_OP_SQRT
= 2¶
-
Square root.
-
enumerator
CUTENSOR_OP_RELU
= 8¶
-
Rectified linear unit.
-
enumerator
CUTENSOR_OP_CONJ
= 9¶
-
Complex conjugate.
-
enumerator
CUTENSOR_OP_RCP
= 10¶
-
Reciprocal.
-
enumerator
CUTENSOR_OP_SIGMOID
= 11¶
-
y=1/(1+exp(-x))
-
enumerator
CUTENSOR_OP_TANH
= 12¶
-
y=tanh(x)
-
enumerator
CUTENSOR_OP_EXP
= 22¶
-
Exponentiation.
-
enumerator
CUTENSOR_OP_LOG
= 23¶
-
Log (base e).
-
enumerator
CUTENSOR_OP_ABS
= 24¶
-
Absolute value.
-
enumerator
CUTENSOR_OP_NEG
= 25¶
-
Negation.
-
enumerator
CUTENSOR_OP_SIN
= 26¶
-
Sine.
-
enumerator
CUTENSOR_OP_COS
= 27¶
-
Cosine.
-
enumerator
CUTENSOR_OP_TAN
= 28¶
-
Tangent.
-
enumerator
CUTENSOR_OP_SINH
= 29¶
-
Hyperbolic sine.
-
enumerator
CUTENSOR_OP_COSH
= 30¶
-
Hyperbolic cosine.
-
enumerator
CUTENSOR_OP_ASIN
= 31¶
-
Inverse sine.
-
enumerator
CUTENSOR_OP_ACOS
= 32¶
-
Inverse cosine.
-
enumerator
CUTENSOR_OP_ATAN
= 33¶
-
Inverse tangent.
-
enumerator
CUTENSOR_OP_ASINH
= 34¶
-
Inverse hyperbolic sine.
-
enumerator
CUTENSOR_OP_ACOSH
= 35¶
-
Inverse hyperbolic cosine.
-
enumerator
CUTENSOR_OP_ATANH
= 36¶
-
Inverse hyperbolic tangent.
-
enumerator
CUTENSOR_OP_CEIL
= 37¶
-
Ceiling.
-
enumerator
CUTENSOR_OP_FLOOR
= 38¶
-
Floor.
-
enumerator
CUTENSOR_OP_ADD
= 3¶
-
Addition of two elements.
-
enumerator
CUTENSOR_OP_MUL
= 5¶
-
Multiplication of two elements.
-
enumerator
CUTENSOR_OP_MAX
= 6¶
-
Maximum of two elements.
-
enumerator
CUTENSOR_OP_MIN
= 7¶
-
Minimum of two elements.
-
enumerator
CUTENSOR_OP_UNKNOWN
= 126¶
-
reserved for internal use only
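For testing, a scalar host-side reference of some of these operators can be handy. The sketch below covers only a subset and leaves out the transcendental operators; the enum is a local mirror of the values listed above, the helper names `apply_unary` and `apply_binary` are hypothetical, and the code says nothing about how cuTENSOR evaluates the operators on the device.

```c
#include <assert.h>

/* Local mirror of a subset of the values listed above (illustration only). */
enum {
    OP_IDENTITY = 1, OP_ADD = 3, OP_MUL = 5, OP_MAX = 6, OP_MIN = 7,
    OP_RELU = 8, OP_RCP = 10, OP_ABS = 24, OP_NEG = 25
};

/* Scalar reference for a few unary operators. */
static double apply_unary(int op, double x) {
    switch (op) {
    case OP_IDENTITY: return x;
    case OP_RELU:     return x > 0.0 ? x : 0.0;
    case OP_RCP:      return 1.0 / x;
    case OP_ABS:      return x < 0.0 ? -x : x;
    case OP_NEG:      return -x;
    default:
        assert(0 && "unary operator not covered by this sketch");
        return x;
    }
}

/* Scalar reference for the binary operators. */
static double apply_binary(int op, double a, double b) {
    switch (op) {
    case OP_ADD: return a + b;
    case OP_MUL: return a * b;
    case OP_MAX: return a > b ? a : b;
    case OP_MIN: return a < b ? a : b;
    default:
        assert(0 && "binary operator not covered by this sketch");
        return a;
    }
}
```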
cutensorStatus_t
¶
-
enum
cutensorStatus_t
¶
-
- Brief
-
cuTENSOR status return type.
- Details
-
The type is used for function status returns. All cuTENSOR library functions return their status, which can have the following values.
Values:
-
enumerator
CUTENSOR_STATUS_SUCCESS
= 0¶
-
The operation completed successfully.
-
enumerator
CUTENSOR_STATUS_NOT_INITIALIZED
= 1¶
-
The cuTENSOR library was not initialized.
-
enumerator
CUTENSOR_STATUS_ALLOC_FAILED
= 3¶
-
Resource allocation failed inside the cuTENSOR library.
-
enumerator
CUTENSOR_STATUS_INVALID_VALUE
= 7¶
-
An unsupported value or parameter was passed to the function (indicates a user error).
-
enumerator
CUTENSOR_STATUS_ARCH_MISMATCH
= 8¶
-
Indicates that the device is either not ready, or the target architecture is not supported.
-
enumerator
CUTENSOR_STATUS_MAPPING_ERROR
= 11¶
-
An access to GPU memory space failed, which is usually caused by a failure to bind a texture.
-
enumerator
CUTENSOR_STATUS_EXECUTION_FAILED
= 13¶
-
The GPU program failed to execute. This is often caused by a launch failure of the kernel on the GPU, which can be caused by multiple reasons.
-
enumerator
CUTENSOR_STATUS_INTERNAL_ERROR
= 14¶
-
An internal cuTENSOR error has occurred.
-
enumerator
CUTENSOR_STATUS_NOT_SUPPORTED
= 15¶
-
The requested operation is not supported.
-
enumerator
CUTENSOR_STATUS_LICENSE_ERROR
= 16¶
-
The requested functionality requires a license, and an error was detected while checking the current licensing.
-
enumerator
CUTENSOR_STATUS_CUBLAS_ERROR
= 17¶
-
A call to CUBLAS did not succeed.
-
enumerator
CUTENSOR_STATUS_CUDA_ERROR
= 18¶
-
Some unknown CUDA error has occurred.
-
enumerator
CUTENSOR_STATUS_INSUFFICIENT_WORKSPACE
= 19¶
-
The provided workspace was insufficient.
-
enumerator
CUTENSOR_STATUS_INSUFFICIENT_DRIVER
= 20¶
-
Indicates that the driver version is insufficient.
-
enumerator
CUTENSOR_STATUS_IO_ERROR
= 21¶
-
Indicates an error related to file I/O.
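Since the status codes are not contiguous, a lookup table is a simple way to translate between codes and names in logs or tests. The sketch below mirrors the values listed above in a local table; `status_name` and `status_code` are hypothetical helpers. In real code, the library's own cutensorGetErrorString() would normally be the preferred way to obtain a message for a status.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef struct { int code; const char *name; } status_entry;

/* Local mirror of the values listed above (illustration only). */
static const status_entry STATUS_TABLE[] = {
    { 0,  "CUTENSOR_STATUS_SUCCESS" },
    { 1,  "CUTENSOR_STATUS_NOT_INITIALIZED" },
    { 3,  "CUTENSOR_STATUS_ALLOC_FAILED" },
    { 7,  "CUTENSOR_STATUS_INVALID_VALUE" },
    { 8,  "CUTENSOR_STATUS_ARCH_MISMATCH" },
    { 11, "CUTENSOR_STATUS_MAPPING_ERROR" },
    { 13, "CUTENSOR_STATUS_EXECUTION_FAILED" },
    { 14, "CUTENSOR_STATUS_INTERNAL_ERROR" },
    { 15, "CUTENSOR_STATUS_NOT_SUPPORTED" },
    { 16, "CUTENSOR_STATUS_LICENSE_ERROR" },
    { 17, "CUTENSOR_STATUS_CUBLAS_ERROR" },
    { 18, "CUTENSOR_STATUS_CUDA_ERROR" },
    { 19, "CUTENSOR_STATUS_INSUFFICIENT_WORKSPACE" },
    { 20, "CUTENSOR_STATUS_INSUFFICIENT_DRIVER" },
    { 21, "CUTENSOR_STATUS_IO_ERROR" }
};

#define STATUS_COUNT (sizeof(STATUS_TABLE) / sizeof(STATUS_TABLE[0]))

/* Returns the enumerator name for a code, or "unlisted" for gaps such as 2. */
static const char *status_name(int code) {
    for (size_t i = 0; i < STATUS_COUNT; ++i) {
        if (STATUS_TABLE[i].code == code)
            return STATUS_TABLE[i].name;
    }
    return "unlisted";
}

/* Returns the code for an enumerator name, or -1 if the name is unknown. */
static int status_code(const char *name) {
    assert(name != NULL);
    for (size_t i = 0; i < STATUS_COUNT; ++i) {
        if (strcmp(STATUS_TABLE[i].name, name) == 0)
            return STATUS_TABLE[i].code;
    }
    return -1;
}
```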