cuTENSOR Data Types¶
cutensorComputeType_t
¶
-
enum cutensorComputeType_t¶
- Brief
Encodes cuTENSOR’s compute type (see “User Guide - Accuracy Guarantees” for details).
Values:
-
enumerator CUTENSOR_COMPUTE_16F¶
floating-point: 5-bit exponent and 10-bit mantissa (aka half)
-
enumerator CUTENSOR_COMPUTE_16BF¶
floating-point: 8-bit exponent and 7-bit mantissa (aka bfloat)
-
enumerator CUTENSOR_COMPUTE_TF32¶
floating-point: 8-bit exponent and 10-bit mantissa (aka tensor-float-32)
-
enumerator CUTENSOR_COMPUTE_32F¶
floating-point: 8-bit exponent and 23-bit mantissa (aka float)
-
enumerator CUTENSOR_COMPUTE_64F¶
floating-point: 11-bit exponent and 52-bit mantissa (aka double)
-
enumerator CUTENSOR_COMPUTE_8U¶
8-bit unsigned integer
-
enumerator CUTENSOR_COMPUTE_8I¶
8-bit signed integer
-
enumerator CUTENSOR_COMPUTE_32U¶
32-bit unsigned integer
-
enumerator CUTENSOR_COMPUTE_32I¶
32-bit signed integer
-
enumerator CUTENSOR_R_MIN_16F¶
DEPRECATED (real as a half), please use CUTENSOR_COMPUTE_16F instead.
-
enumerator CUTENSOR_C_MIN_16F¶
DEPRECATED (complex as a half), please use CUTENSOR_COMPUTE_16F instead.
-
enumerator CUTENSOR_R_MIN_32F¶
DEPRECATED (real as a float), please use CUTENSOR_COMPUTE_32F instead.
-
enumerator CUTENSOR_C_MIN_32F¶
DEPRECATED (complex as a float), please use CUTENSOR_COMPUTE_32F instead.
-
enumerator CUTENSOR_R_MIN_64F¶
DEPRECATED (real as a double), please use CUTENSOR_COMPUTE_64F instead.
-
enumerator CUTENSOR_C_MIN_64F¶
DEPRECATED (complex as a double), please use CUTENSOR_COMPUTE_64F instead.
-
enumerator CUTENSOR_R_MIN_8U¶
DEPRECATED (real as a uint8), please use CUTENSOR_COMPUTE_8U instead.
-
enumerator CUTENSOR_R_MIN_32U¶
DEPRECATED (real as a uint32), please use CUTENSOR_COMPUTE_32U instead.
-
enumerator CUTENSOR_R_MIN_8I¶
DEPRECATED (real as an int8), please use CUTENSOR_COMPUTE_8I instead.
-
enumerator CUTENSOR_R_MIN_32I¶
DEPRECATED (real as an int32), please use CUTENSOR_COMPUTE_32I instead.
-
enumerator CUTENSOR_R_MIN_16BF¶
DEPRECATED (real as a bfloat16), please use CUTENSOR_COMPUTE_16BF instead.
-
enumerator CUTENSOR_R_MIN_TF32¶
DEPRECATED (real as a tensorfloat32), please use CUTENSOR_COMPUTE_TF32 instead.
-
enumerator CUTENSOR_C_MIN_TF32¶
DEPRECATED (complex as a tensorfloat32), please use CUTENSOR_COMPUTE_TF32 instead.
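The exponent and mantissa widths listed for the floating-point compute types determine their range and precision. The following is a minimal sketch, not part of cuTENSOR: `FloatFormat`, `kFormats`, `machineEpsilon` and `maxExponent` are hypothetical names introduced only for illustration, and the table simply restates the bit widths given above, assuming an IEEE-style layout with an implicit leading one.

```cpp
#include <cassert>
#include <cmath>

// Bit layout of a binary floating-point format, as listed for the
// floating-point CUTENSOR_COMPUTE_* enumerators.
struct FloatFormat {
    const char* name;  // illustrative label, not a cuTENSOR identifier
    int exponentBits;
    int mantissaBits;  // explicitly stored mantissa bits
};

// Hypothetical table for illustration; cuTENSOR exposes no such structure.
static const FloatFormat kFormats[] = {
    {"16F",  5, 10},
    {"16BF", 8, 7},
    {"TF32", 8, 10},
    {"32F",  8, 23},
    {"64F", 11, 52},
};

// Gap between 1.0 and the next representable value: 2^-mantissaBits.
inline double machineEpsilon(const FloatFormat& f) {
    assert(f.mantissaBits > 0);
    return std::ldexp(1.0, -f.mantissaBits);
}

// Largest unbiased exponent of a normal number: 2^(exponentBits-1) - 1.
inline int maxExponent(const FloatFormat& f) {
    assert(f.exponentBits > 1);
    return (1 << (f.exponentBits - 1)) - 1;
}
```

Under these assumptions TF32 has the same relative precision as half (2^-10) but the same exponent range as float, which is why it sits between the two in accuracy.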
cutensorHandle_t
¶
-
struct cutensorHandle_t¶
- Brief
Opaque structure holding cuTENSOR’s library context.
cutensorTensorDescriptor_t
¶
-
struct cutensorTensorDescriptor_t¶
- Brief
Opaque structure representing a tensor descriptor.
cutensorContractionDescriptor_t
¶
-
struct cutensorContractionDescriptor_t¶
- Brief
Opaque structure representing a tensor contraction descriptor.
cutensorContractionDescriptorAttributes_t
¶
-
enum cutensorContractionDescriptorAttributes_t¶
This enum lists all attributes of a cutensorContractionDescriptor_t that can be modified.
Values:
-
enumerator CUTENSOR_CONTRACTION_DESCRIPTOR_TAG¶
uint32_t: Enables users to distinguish two otherwise identical tensor contractions with respect to the software-managed plan cache (default value: 0).
cutensorContractionFind_t
¶
-
struct cutensorContractionFind_t¶
- Brief
Opaque structure representing a candidate.
cutensorContractionFindAttributes_t
¶
-
enum cutensorContractionFindAttributes_t¶
This enum lists all attributes of a cutensorContractionFind_t that can be modified.
Values:
-
enumerator CUTENSOR_CONTRACTION_FIND_AUTOTUNE_MODE¶
cutensorAutotuneMode_t: Determines if the corresponding algorithm/kernel for this plan should be cached.
-
enumerator CUTENSOR_CONTRACTION_FIND_CACHE_MODE¶
cutensorCacheMode_t: Gives fine control over what is considered a cache hit.
-
enumerator CUTENSOR_CONTRACTION_FIND_INCREMENTAL_COUNT¶
uint32_t: Only applicable if CUTENSOR_CONTRACTION_FIND_AUTOTUNE_MODE is set to CUTENSOR_AUTOTUNE_INCREMENTAL.
cutensorContractionPlan_t
¶
-
struct cutensorContractionPlan_t¶
- Brief
Opaque structure representing a plan.
cutensorAutotuneMode_t
¶
-
enum cutensorAutotuneMode_t¶
This enum controls how autotuning interacts with cuTENSOR’s plan cache.
Values:
-
enumerator CUTENSOR_AUTOTUNE_NONE¶
Indicates no autotuning (default). In this case the cache helps to reduce the plan-creation overhead: on a cache hit the cached plan is reused; otherwise the plan cache is ignored.
-
enumerator CUTENSOR_AUTOTUNE_INCREMENTAL¶
Indicates incremental autotuning: each invocation of the corresponding cutensorInitContractionPlan() creates a plan based on a different algorithm/kernel; the maximum number of kernels that will be tested is defined by the CUTENSOR_CONTRACTION_FIND_INCREMENTAL_COUNT attribute (see cutensorContractionFindAttributes_t). WARNING: If this autotuning mode is selected, bit-wise identical results cannot be guaranteed, since different algorithms may be executed.
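The incremental mode can be pictured with a small toy model. This is only a sketch under stated assumptions: `IncrementalTuner` and its members are hypothetical and not cuTENSOR API, kernels are represented by plain indices, and the behavior after the test budget is used up (reusing the fastest measured candidate) is an assumption made for the illustration rather than something stated above.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>
#include <vector>

// Toy model of incremental autotuning; every name here is hypothetical.
class IncrementalTuner {
public:
    // numKernels: candidate kernels available for one contraction.
    // incrementalCount: upper bound on how many candidates get tested
    // (the role played by CUTENSOR_CONTRACTION_FIND_INCREMENTAL_COUNT).
    IncrementalTuner(uint32_t numKernels, uint32_t incrementalCount)
        : limit_(incrementalCount < numKernels ? incrementalCount : numKernels) {
        assert(numKernels > 0 && incrementalCount > 0);
    }

    // Kernel for the next plan: an untested candidate while the budget
    // lasts, afterwards the fastest one measured so far (assumption).
    uint32_t nextKernel() const {
        return times_.size() < limit_ ? static_cast<uint32_t>(times_.size())
                                      : best_;
    }

    // Record the measured run time of the kernel returned by nextKernel().
    void report(double milliseconds) {
        if (times_.size() >= limit_) return;  // tuning phase is over
        if (milliseconds < bestTime_) {
            bestTime_ = milliseconds;
            best_ = static_cast<uint32_t>(times_.size());
        }
        times_.push_back(milliseconds);
    }

private:
    uint32_t limit_;
    uint32_t best_ = 0;
    double bestTime_ = std::numeric_limits<double>::infinity();
    std::vector<double> times_;
};
```

Because successive calls may run different kernels during the tuning phase, the summation order can differ between calls, which is the reason for the bit-wise reproducibility warning.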
cutensorCacheMode_t
¶
-
enum cutensorCacheMode_t¶
This enum defines what is considered a cache hit.
Values:
-
enumerator CUTENSOR_CACHE_MODE_NONE¶
Plan will not be cached.
-
enumerator CUTENSOR_CACHE_MODE_PEDANTIC¶
All parameters of the corresponding descriptor must be identical to the cached plan (default).
cutensorAlgo_t
¶
-
enum cutensorAlgo_t¶
- Brief
Allows users to specify the algorithm to be used for performing the tensor contraction.
- Details
This enum gives users finer control over which algorithm should be executed by cutensorContraction(); values >= 0 correspond to certain sub-algorithms of GETT.
Values:
-
enumerator CUTENSOR_ALGO_DEFAULT_PATIENT¶
Uses the more accurate but also more time-consuming performance model.
-
enumerator CUTENSOR_ALGO_GETT¶
Choose the GETT algorithm.
-
enumerator CUTENSOR_ALGO_TGETT¶
Transpose (A or B) + GETT.
-
enumerator CUTENSOR_ALGO_TTGT¶
Transpose-Transpose-GEMM-Transpose (requires additional memory)
-
enumerator CUTENSOR_ALGO_DEFAULT¶
Lets the internal heuristic choose.
cutensorWorksizePreference_t
¶
-
enum cutensorWorksizePreference_t¶
- Brief
This enum gives users finer control over the suggested workspace.
- Details
This enum gives users finer control over the amount of workspace that is suggested by cutensorContractionGetWorkspace.
Values:
-
enumerator CUTENSOR_WORKSPACE_MIN¶
At least one algorithm will be available.
-
enumerator CUTENSOR_WORKSPACE_RECOMMENDED¶
The most suitable algorithm will be available.
-
enumerator CUTENSOR_WORKSPACE_MAX¶
All algorithms will be available.
cutensorOperator_t
¶
-
enum cutensorOperator_t¶
- Brief
This enum captures all unary and binary element-wise operations supported by the cuTENSOR library.
Values:
-
enumerator CUTENSOR_OP_IDENTITY¶
Identity operator (i.e., elements are not changed)
-
enumerator CUTENSOR_OP_SQRT¶
Square root.
-
enumerator CUTENSOR_OP_RELU¶
Rectified linear unit.
-
enumerator CUTENSOR_OP_CONJ¶
Complex conjugate.
-
enumerator CUTENSOR_OP_RCP¶
Reciprocal.
-
enumerator CUTENSOR_OP_SIGMOID¶
y=1/(1+exp(-x))
-
enumerator CUTENSOR_OP_TANH¶
y=tanh(x)
-
enumerator CUTENSOR_OP_EXP¶
Exponentiation.
-
enumerator CUTENSOR_OP_LOG¶
Log (base e).
-
enumerator CUTENSOR_OP_ABS¶
Absolute value.
-
enumerator CUTENSOR_OP_NEG¶
Negation.
-
enumerator CUTENSOR_OP_SIN¶
Sine.
-
enumerator CUTENSOR_OP_COS¶
Cosine.
-
enumerator CUTENSOR_OP_TAN¶
Tangent.
-
enumerator CUTENSOR_OP_SINH¶
Hyperbolic sine.
-
enumerator CUTENSOR_OP_COSH¶
Hyperbolic cosine.
-
enumerator CUTENSOR_OP_ASIN¶
Inverse sine.
-
enumerator CUTENSOR_OP_ACOS¶
Inverse cosine.
-
enumerator CUTENSOR_OP_ATAN¶
Inverse tangent.
-
enumerator CUTENSOR_OP_ASINH¶
Inverse hyperbolic sine.
-
enumerator CUTENSOR_OP_ACOSH¶
Inverse hyperbolic cosine.
-
enumerator CUTENSOR_OP_ATANH¶
Inverse hyperbolic tangent.
-
enumerator CUTENSOR_OP_CEIL¶
Ceiling.
-
enumerator CUTENSOR_OP_FLOOR¶
Floor.
-
enumerator CUTENSOR_OP_ADD¶
Addition of two elements.
-
enumerator CUTENSOR_OP_MUL¶
Multiplication of two elements.
-
enumerator CUTENSOR_OP_MAX¶
Maximum of two elements.
-
enumerator CUTENSOR_OP_MIN¶
Minimum of two elements.
-
enumerator CUTENSOR_OP_UNKNOWN¶
reserved for internal use only
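As a reading aid, the element-wise meaning of a few of these operators can be written as a host-side reference. This is a minimal sketch, not library code: `Op`, `applyUnary` and `applyBinary` are hypothetical names, the local enum only mirrors a subset of the enumerators above, and double precision is assumed for every element.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>

// Stand-in for a few cutensorOperator_t values so the sketch compiles
// without cutensor.h; the numeric values are arbitrary.
enum class Op { Identity, Sqrt, Relu, Rcp, Sigmoid, Tanh, Add, Mul, Max, Min };

// Reference result of a unary operator applied to a single element.
inline double applyUnary(Op op, double x) {
    switch (op) {
        case Op::Identity: return x;
        case Op::Sqrt:     return std::sqrt(x);
        case Op::Relu:     return std::max(x, 0.0);
        case Op::Rcp:      return 1.0 / x;
        case Op::Sigmoid:  return 1.0 / (1.0 + std::exp(-x));
        case Op::Tanh:     return std::tanh(x);
        default:
            assert(false && "not a unary operator");
            return x;
    }
}

// Reference result of a binary operator applied to a pair of elements.
inline double applyBinary(Op op, double a, double b) {
    switch (op) {
        case Op::Add: return a + b;
        case Op::Mul: return a * b;
        case Op::Max: return std::max(a, b);
        case Op::Min: return std::min(a, b);
        default:
            assert(false && "not a binary operator");
            return a;
    }
}
```

A reference of this kind is mainly useful for checking GPU results on small tensors; the library's own results may differ in the last bits depending on the compute type.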
cutensorStatus_t
¶
-
enum cutensorStatus_t¶
- Brief
Status type returned by cuTENSOR functions.
- Details
The type is used for function status returns. All cuTENSOR library functions return their status, which can have the following values.
Values:
-
enumerator CUTENSOR_STATUS_SUCCESS¶
The operation completed successfully.
-
enumerator CUTENSOR_STATUS_NOT_INITIALIZED¶
The opaque data structure was not initialized.
-
enumerator CUTENSOR_STATUS_ALLOC_FAILED¶
Resource allocation failed inside the cuTENSOR library.
-
enumerator CUTENSOR_STATUS_INVALID_VALUE¶
An unsupported value or parameter was passed to the function (indicates a user error).
-
enumerator CUTENSOR_STATUS_ARCH_MISMATCH¶
Indicates that the device is either not ready, or the target architecture is not supported.
-
enumerator CUTENSOR_STATUS_MAPPING_ERROR¶
An access to GPU memory space failed, which is usually caused by a failure to bind a texture.
-
enumerator CUTENSOR_STATUS_EXECUTION_FAILED¶
The GPU program failed to execute. This is often caused by a launch failure of the kernel on the GPU, which can be caused by multiple reasons.
-
enumerator CUTENSOR_STATUS_INTERNAL_ERROR¶
An internal cuTENSOR error has occurred.
-
enumerator CUTENSOR_STATUS_NOT_SUPPORTED¶
The requested operation is not supported.
-
enumerator CUTENSOR_STATUS_LICENSE_ERROR¶
The functionality requested requires some license and an error was detected when trying to check the current licensing.
-
enumerator CUTENSOR_STATUS_CUBLAS_ERROR¶
A call to CUBLAS did not succeed.
-
enumerator CUTENSOR_STATUS_CUDA_ERROR¶
Some unknown CUDA error has occurred.
-
enumerator CUTENSOR_STATUS_INSUFFICIENT_WORKSPACE¶
The provided workspace was insufficient.
-
enumerator CUTENSOR_STATUS_INSUFFICIENT_DRIVER¶
Indicates that the driver version is insufficient.
-
enumerator CUTENSOR_STATUS_IO_ERROR¶
Indicates an error related to file I/O.
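Since every library function returns a status, callers typically funnel each return value through one checking helper. The sketch below shows that pattern in isolation and makes several assumptions: `Status`, `describe` and `check` are hypothetical stand-ins so the code compiles without cutensor.h, only a subset of the values is mirrored, and their numeric values are arbitrary. Real code would test a cutensorStatus_t against CUTENSOR_STATUS_SUCCESS instead.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>

// Stand-in for a subset of cutensorStatus_t; not the library's type.
enum class Status {
    Success,
    NotInitialized,
    AllocFailed,
    InvalidValue,
    InsufficientWorkspace
};

// Short description per status, taken from the list above.
inline const char* describe(Status s) {
    switch (s) {
        case Status::Success:
            return "The operation completed successfully.";
        case Status::NotInitialized:
            return "The opaque data structure was not initialized.";
        case Status::AllocFailed:
            return "Resource allocation failed inside the cuTENSOR library.";
        case Status::InvalidValue:
            return "An unsupported value or parameter was passed to the function.";
        case Status::InsufficientWorkspace:
            return "The provided workspace was insufficient.";
    }
    return "Unknown status.";
}

// Turn any non-success status into an exception that names the call site.
inline void check(Status s, const char* what) {
    assert(what != nullptr);
    if (s != Status::Success)
        throw std::runtime_error(std::string(what) + ": " + describe(s));
}
```

Wrapping each call as `check(someCall(...), "someCall")` keeps error handling in one place; a macro that also records `__FILE__` and `__LINE__` is a common variant.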
cudaDataType_t
¶
-
enum cudaDataType_t¶
cudaDataType_t is an enumeration of the types supported by CUDA libraries. cuTENSOR supports real FP16, BF16, FP32 and FP64 as well as complex FP32 and FP64 input types.
Values:
-
enumerator CUDA_R_16F¶
16-bit real half precision floating-point type
-
enumerator CUDA_R_16BF¶
16-bit real BF16 floating-point type
-
enumerator CUDA_R_32F¶
32-bit real single precision floating-point type
-
enumerator CUDA_C_32F¶
32-bit complex single precision floating-point type (represented as pair of real and imaginary part)
-
enumerator CUDA_R_64F¶
64-bit real double precision floating-point type
-
enumerator CUDA_C_64F¶
64-bit complex double precision floating-point type (represented as pair of real and imaginary part)
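The bit widths above translate directly into storage requirements. The following is a small sketch under stated assumptions: `DataType`, `elementSize` and `tensorBytes` are hypothetical names, the local enum stands in for the cudaDataType_t values listed here so the code compiles without the CUDA headers, and the tensor is assumed to be densely packed.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Stand-in for the cudaDataType_t values listed above; not the CUDA enum.
enum class DataType { R_16F, R_16BF, R_32F, C_32F, R_64F, C_64F };

// Bytes per element; complex types store a real and an imaginary part.
inline std::size_t elementSize(DataType t) {
    switch (t) {
        case DataType::R_16F:  return 2;
        case DataType::R_16BF: return 2;
        case DataType::R_32F:  return 4;
        case DataType::C_32F:  return 8;
        case DataType::R_64F:  return 8;
        case DataType::C_64F:  return 16;
    }
    return 0;  // unreachable for the values above
}

// Bytes needed for a densely packed tensor with the given mode extents.
// An empty extent list describes a scalar (one element).
inline std::size_t tensorBytes(DataType t, const std::vector<int64_t>& extents) {
    std::size_t count = 1;
    for (int64_t e : extents) {
        assert(e > 0);
        count *= static_cast<std::size_t>(e);
    }
    return count * elementSize(t);
}
```

For example, a 3 x 5 tensor of complex single-precision elements occupies 3 * 5 * 8 = 120 bytes under these assumptions.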
cutensorLoggerCallback_t
¶
-
typedef void (*cutensorLoggerCallback_t)(int32_t logLevel, const char *functionName, const char *message)¶
- Brief
A function pointer type for logging.
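A function matching this signature might look as follows. This is a minimal sketch with assumptions spelled out: `LoggerCallback` repeats the signature of the typedef above under a local name so the code compiles without cutensor.h, and `g_logBuffer`, `collectLog` and `kCallback` are hypothetical. How the callback is registered with the library is not covered by this page and is left out.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Same signature as cutensorLoggerCallback_t, under a local name.
using LoggerCallback = void (*)(int32_t logLevel,
                                const char* functionName,
                                const char* message);

// A plain function pointer cannot carry state, so the sketch collects
// log lines in a global buffer.
static std::string g_logBuffer;

// Formats one log record as "[level] function: message\n".
void collectLog(int32_t logLevel, const char* functionName, const char* message) {
    g_logBuffer += "[" + std::to_string(logLevel) + "] ";
    g_logBuffer += functionName ? functionName : "(unknown)";
    g_logBuffer += ": ";
    g_logBuffer += message ? message : "";
    g_logBuffer += '\n';
}

// Compiles only if collectLog matches the callback signature.
static const LoggerCallback kCallback = &collectLog;
```

The null checks are defensive; whether the library can pass null pointers is not stated here. A callback that may be invoked from several threads would additionally need to guard the shared buffer.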