cuTENSOR Data Types

cutensorComputeType_t

enum cutensorComputeType_t

Brief

Encodes cuTENSOR’s compute type (see “User Guide - Accuracy Guarantees” for details).

Values:

enumerator CUTENSOR_COMPUTE_16F

floating-point: 5-bit exponent and 10-bit mantissa (aka half)

enumerator CUTENSOR_COMPUTE_16BF

floating-point: 8-bit exponent and 7-bit mantissa (aka bfloat)

enumerator CUTENSOR_COMPUTE_TF32

floating-point: 8-bit exponent and 10-bit mantissa (aka tensor-float-32)

enumerator CUTENSOR_COMPUTE_32F

floating-point: 8-bit exponent and 23-bit mantissa (aka float)

enumerator CUTENSOR_COMPUTE_64F

floating-point: 11-bit exponent and 52-bit mantissa (aka double)

enumerator CUTENSOR_COMPUTE_8U

8-bit unsigned integer

enumerator CUTENSOR_COMPUTE_8I

8-bit signed integer

enumerator CUTENSOR_COMPUTE_32U

32-bit unsigned integer

enumerator CUTENSOR_COMPUTE_32I

32-bit signed integer

enumerator CUTENSOR_R_MIN_16F

DEPRECATED (real as a half), please use CUTENSOR_COMPUTE_16F instead.

enumerator CUTENSOR_C_MIN_16F

DEPRECATED (complex as a half), please use CUTENSOR_COMPUTE_16F instead.

enumerator CUTENSOR_R_MIN_32F

DEPRECATED (real as a float), please use CUTENSOR_COMPUTE_32F instead.

enumerator CUTENSOR_C_MIN_32F

DEPRECATED (complex as a float), please use CUTENSOR_COMPUTE_32F instead.

enumerator CUTENSOR_R_MIN_64F

DEPRECATED (real as a double), please use CUTENSOR_COMPUTE_64F instead.

enumerator CUTENSOR_C_MIN_64F

DEPRECATED (complex as a double), please use CUTENSOR_COMPUTE_64F instead.

enumerator CUTENSOR_R_MIN_8U

DEPRECATED (real as a uint8), please use CUTENSOR_COMPUTE_8U instead.

enumerator CUTENSOR_R_MIN_32U

DEPRECATED (real as a uint32), please use CUTENSOR_COMPUTE_32U instead.

enumerator CUTENSOR_R_MIN_8I

DEPRECATED (real as an int8), please use CUTENSOR_COMPUTE_8I instead.

enumerator CUTENSOR_R_MIN_32I

DEPRECATED (real as an int32), please use CUTENSOR_COMPUTE_32I instead.

enumerator CUTENSOR_R_MIN_16BF

DEPRECATED (real as a bfloat16), please use CUTENSOR_COMPUTE_16BF instead.

enumerator CUTENSOR_R_MIN_TF32

DEPRECATED (real as a tensorfloat32), please use CUTENSOR_COMPUTE_TF32 instead.

enumerator CUTENSOR_C_MIN_TF32

DEPRECATED (complex as a tensorfloat32), please use CUTENSOR_COMPUTE_TF32 instead.
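
The exponent and mantissa widths listed for the floating-point compute types determine how finely values near 1.0 can be resolved. The following is a minimal sketch in plain C++ that needs no cuTENSOR headers. The names FloatFormat, kFormats, machineEpsilon and truncateMantissa are hypothetical helpers introduced for illustration, and truncation is a simplification (actual conversions may round instead).

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Exponent/mantissa widths as listed for the floating-point compute types above.
struct FloatFormat {
    const char* name;
    int exponentBits;
    int mantissaBits;  // explicitly stored fraction bits
};

constexpr FloatFormat kFormats[] = {
    {"CUTENSOR_COMPUTE_16F", 5, 10},
    {"CUTENSOR_COMPUTE_16BF", 8, 7},
    {"CUTENSOR_COMPUTE_TF32", 8, 10},
    {"CUTENSOR_COMPUTE_32F", 8, 23},
    {"CUTENSOR_COMPUTE_64F", 11, 52},
};

// Machine epsilon: the gap between 1.0 and the next representable value,
// which equals 2^-mantissaBits.
double machineEpsilon(const FloatFormat& f) {
    double eps = 1.0;
    for (int i = 0; i < f.mantissaBits; ++i) eps *= 0.5;  // exact halving
    return eps;
}

// Hypothetical helper: keep only the top `mantissaBits` fraction bits of a
// 32-bit float by truncation. Only meaningful for formats that share float's
// 8-bit exponent (bfloat16 and TF32 in the list above).
float truncateMantissa(float x, int mantissaBits) {
    assert(mantissaBits >= 0 && mantissaBits <= 23);
    std::uint32_t bits;
    std::memcpy(&bits, &x, sizeof bits);
    const std::uint32_t mask = ~((1u << (23 - mantissaBits)) - 1u);
    bits &= mask;  // zero the low-order fraction bits
    std::memcpy(&x, &bits, sizeof x);
    return x;
}
```

For example, truncateMantissa(1.00048828125f, 10) drops the 2^-11 term and yields 1.0f, which illustrates why a TF32 compute type resolves fewer digits than CUTENSOR_COMPUTE_32F even though both cover the same exponent range.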


cutensorHandle_t

struct cutensorHandle_t

Brief

Opaque structure holding cuTENSOR’s library context.


cutensorTensorDescriptor_t

struct cutensorTensorDescriptor_t

Brief

Opaque structure representing a tensor descriptor.


cutensorContractionDescriptor_t

struct cutensorContractionDescriptor_t

Brief

Opaque structure representing a tensor contraction descriptor.


cutensorContractionDescriptorAttributes_t

enum cutensorContractionDescriptorAttributes_t

This enum lists all attributes of a cutensorContractionDescriptor_t that can be modified.

Values:

enumerator CUTENSOR_CONTRACTION_DESCRIPTOR_TAG

uint32_t: Enables users to distinguish two otherwise identical tensor contractions with respect to the software-managed plan cache (default value: 0).


cutensorContractionFind_t

struct cutensorContractionFind_t

Brief

Opaque structure representing a candidate.


cutensorContractionFindAttributes_t

enum cutensorContractionFindAttributes_t

This enum lists all attributes of a cutensorContractionFind_t that can be modified.

Values:

enumerator CUTENSOR_CONTRACTION_FIND_AUTOTUNE_MODE

cutensorAutotuneMode_t: Determines if the corresponding algorithm/kernel for this plan should be cached.

enumerator CUTENSOR_CONTRACTION_FIND_CACHE_MODE

cutensorCacheMode_t: Gives fine-grained control over what is considered a cache hit.

enumerator CUTENSOR_CONTRACTION_FIND_INCREMENTAL_COUNT

uint32_t: Only applicable if CUTENSOR_CONTRACTION_FIND_AUTOTUNE_MODE is set to CUTENSOR_AUTOTUNE_INCREMENTAL; defines the maximum number of kernels that will be tested.


cutensorContractionPlan_t

struct cutensorContractionPlan_t

Brief

Opaque structure representing a plan.


cutensorAutotuneMode_t

enum cutensorAutotuneMode_t

This enum controls how autotuning interacts with cuTENSOR’s plan cache.

Values:

enumerator CUTENSOR_AUTOTUNE_NONE

Indicates no autotuning (default). In this case the cache helps to reduce the plan-creation overhead: on a cache hit the cached plan is reused; otherwise the plan cache is ignored.

enumerator CUTENSOR_AUTOTUNE_INCREMENTAL

Indicates incremental autotuning: each invocation of the corresponding cutensorInitContractionPlan() creates a plan based on a different algorithm/kernel, and the maximum number of kernels that will be tested is defined by the CUTENSOR_CONTRACTION_FIND_INCREMENTAL_COUNT attribute of cutensorContractionFindAttributes_t. WARNING: If this autotuning mode is selected, bit-wise identical results cannot be guaranteed, since different algorithms could be executed.
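
The incremental mode described above can be pictured with a small toy model. The sketch below is plain C++ and is not cuTENSOR code: ToyIncrementalTuner and its members are hypothetical names, and the choice to settle on the fastest candidate once the limit is reached is an assumption made for illustration.

```cpp
#include <algorithm>
#include <cstddef>

// Toy model of incremental autotuning; this is NOT cuTENSOR code.
class ToyIncrementalTuner {
public:
    // numCandidates: number of kernels available (assumed known here).
    // incrementalCount: models CUTENSOR_CONTRACTION_FIND_INCREMENTAL_COUNT.
    ToyIncrementalTuner(std::size_t numCandidates, std::size_t incrementalCount)
        : limit_(std::min(numCandidates, incrementalCount)) {}

    // True while untested candidates remain within the limit.
    bool tuning() const { return tried_ < limit_; }

    // Candidate index the next "plan creation" would use: a fresh one while
    // tuning, afterwards the fastest one seen so far (an assumption).
    std::size_t nextCandidate() const { return tuning() ? tried_ : best_; }

    // Record the measured time of the candidate returned by nextCandidate().
    void report(double milliseconds) {
        if (!tuning()) return;
        if (tried_ == 0 || milliseconds < bestMs_) {
            best_ = tried_;
            bestMs_ = milliseconds;
        }
        ++tried_;
    }

private:
    std::size_t limit_;
    std::size_t tried_ = 0;
    std::size_t best_ = 0;
    double bestMs_ = 0.0;
};
```

With five candidates and an incremental count of 3, the first three plan creations in this model use candidates 0, 1 and 2. Because successive invocations run different kernels, results may differ bit-wise between them, which is the reason for the warning above.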


cutensorCacheMode_t

enum cutensorCacheMode_t

This enum defines what is considered a cache hit.

Values:

enumerator CUTENSOR_CACHE_MODE_NONE

Plan will not be cached.

enumerator CUTENSOR_CACHE_MODE_PEDANTIC

All parameters of the corresponding descriptor must be identical to the cached plan (default).
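
To illustrate what a pedantic cache hit means, and how CUTENSOR_CONTRACTION_DESCRIPTOR_TAG separates otherwise identical contractions, here is a toy model in plain C++. It is not cuTENSOR's implementation: ToyDescriptor, its fields, and ToyPlanCache are hypothetical stand-ins, and a real descriptor carries more parameters than shown.

```cpp
#include <cstddef>
#include <cstdint>
#include <map>
#include <string>
#include <tuple>
#include <vector>

// Toy stand-in for the parameters of a contraction descriptor (hypothetical fields).
struct ToyDescriptor {
    std::vector<std::int64_t> extents;
    std::string modes;      // e.g. "mk,kn->mn"
    std::uint32_t tag = 0;  // models CUTENSOR_CONTRACTION_DESCRIPTOR_TAG (default 0)

    // Ordering over ALL fields, so two keys are equivalent only if every
    // parameter (including the tag) is identical.
    bool operator<(const ToyDescriptor& other) const {
        return std::tie(extents, modes, tag) <
               std::tie(other.extents, other.modes, other.tag);
    }
};

class ToyPlanCache {
public:
    // Returns true on a cache hit; on a miss the plan id is stored.
    bool lookupOrInsert(const ToyDescriptor& d, int planId) {
        return !cache_.emplace(d, planId).second;
    }

    std::size_t size() const { return cache_.size(); }

private:
    std::map<ToyDescriptor, int> cache_;
};
```

In this model, looking up the same descriptor twice produces a hit on the second call, while changing only the tag produces a miss and a second cache entry.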


cutensorAlgo_t

enum cutensorAlgo_t

Brief

Allows users to specify the algorithm to be used for performing the tensor contraction.

Details

This enum gives users finer control over which algorithm should be executed by cutensorContraction(); values >= 0 correspond to certain sub-algorithms of GETT.

Values:

enumerator CUTENSOR_ALGO_DEFAULT_PATIENT

Uses the more accurate but also more time-consuming performance model.

enumerator CUTENSOR_ALGO_GETT

Choose the GETT algorithm.

enumerator CUTENSOR_ALGO_TGETT

Transpose (A or B) + GETT.

enumerator CUTENSOR_ALGO_TTGT

Transpose-Transpose-GEMM-Transpose (requires additional memory)

enumerator CUTENSOR_ALGO_DEFAULT

Lets the internal heuristic choose.


cutensorWorksizePreference_t

enum cutensorWorksizePreference_t

Brief

This enum gives users finer control over the suggested workspace.

Details

This enum gives users finer control over the amount of workspace that is suggested by cutensorContractionGetWorkspace.

Values:

enumerator CUTENSOR_WORKSPACE_MIN

At least one algorithm will be available.

enumerator CUTENSOR_WORKSPACE_RECOMMENDED

The most suitable algorithm will be available.

enumerator CUTENSOR_WORKSPACE_MAX

All algorithms will be available.


cutensorOperator_t

enum cutensorOperator_t

Brief

This enum captures all unary and binary element-wise operations supported by the cuTENSOR library.

Values:

enumerator CUTENSOR_OP_IDENTITY

Identity operator (i.e., elements are not changed)

enumerator CUTENSOR_OP_SQRT

Square root.

enumerator CUTENSOR_OP_RELU

Rectified linear unit.

enumerator CUTENSOR_OP_CONJ

Complex conjugate.

enumerator CUTENSOR_OP_RCP

Reciprocal.

enumerator CUTENSOR_OP_SIGMOID

y=1/(1+exp(-x))

enumerator CUTENSOR_OP_TANH

y=tanh(x)

enumerator CUTENSOR_OP_EXP

Exponentiation.

enumerator CUTENSOR_OP_LOG

Log (base e).

enumerator CUTENSOR_OP_ABS

Absolute value.

enumerator CUTENSOR_OP_NEG

Negation.

enumerator CUTENSOR_OP_SIN

Sine.

enumerator CUTENSOR_OP_COS

Cosine.

enumerator CUTENSOR_OP_TAN

Tangent.

enumerator CUTENSOR_OP_SINH

Hyperbolic sine.

enumerator CUTENSOR_OP_COSH

Hyperbolic cosine.

enumerator CUTENSOR_OP_ASIN

Inverse sine.

enumerator CUTENSOR_OP_ACOS

Inverse cosine.

enumerator CUTENSOR_OP_ATAN

Inverse tangent.

enumerator CUTENSOR_OP_ASINH

Inverse hyperbolic sine.

enumerator CUTENSOR_OP_ACOSH

Inverse hyperbolic cosine.

enumerator CUTENSOR_OP_ATANH

Inverse hyperbolic tangent.

enumerator CUTENSOR_OP_CEIL

Ceiling.

enumerator CUTENSOR_OP_FLOOR

Floor.

enumerator CUTENSOR_OP_ADD

Addition of two elements.

enumerator CUTENSOR_OP_MUL

Multiplication of two elements.

enumerator CUTENSOR_OP_MAX

Maximum of two elements.

enumerator CUTENSOR_OP_MIN

Minimum of two elements.

enumerator CUTENSOR_OP_UNKNOWN

Reserved for internal use only.
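
As a reading aid for the unary operators above, the following host-side sketch spells out the usual mathematical definitions for a handful of them. It is plain C++, not cuTENSOR code: ToyUnaryOp and applyUnary are hypothetical names, and the definitions are the standard textbook ones rather than a statement about cuTENSOR's numerical behavior.

```cpp
#include <algorithm>
#include <cmath>

// Local stand-in for a subset of cutensorOperator_t (hypothetical enum).
enum class ToyUnaryOp { Identity, Sqrt, Relu, Rcp, Sigmoid, Tanh, Abs, Neg };

// Reference semantics for one element, using the standard definitions.
float applyUnary(ToyUnaryOp op, float x) {
    switch (op) {
        case ToyUnaryOp::Identity: return x;
        case ToyUnaryOp::Sqrt:     return std::sqrt(x);
        case ToyUnaryOp::Relu:     return std::max(x, 0.0f);  // max(x, 0)
        case ToyUnaryOp::Rcp:      return 1.0f / x;
        case ToyUnaryOp::Sigmoid:  return 1.0f / (1.0f + std::exp(-x));
        case ToyUnaryOp::Tanh:     return std::tanh(x);
        case ToyUnaryOp::Abs:      return std::fabs(x);
        case ToyUnaryOp::Neg:      return -x;
    }
    return x;  // unreachable for valid enumerators
}
```

For instance, applyUnary(ToyUnaryOp::Sigmoid, 0.0f) evaluates 1/(1+exp(0)) and returns 0.5f.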


cutensorStatus_t

enum cutensorStatus_t

Brief

cuTENSOR status return type.

Details

The type is used for function status returns. All cuTENSOR library functions return their status, which can have the following values.

Values:

enumerator CUTENSOR_STATUS_SUCCESS

The operation completed successfully.

enumerator CUTENSOR_STATUS_NOT_INITIALIZED

The opaque data structure was not initialized.

enumerator CUTENSOR_STATUS_ALLOC_FAILED

Resource allocation failed inside the cuTENSOR library.

enumerator CUTENSOR_STATUS_INVALID_VALUE

An unsupported value or parameter was passed to the function (indicates a user error).

enumerator CUTENSOR_STATUS_ARCH_MISMATCH

Indicates that the device is either not ready, or the target architecture is not supported.

enumerator CUTENSOR_STATUS_MAPPING_ERROR

An access to GPU memory space failed, which is usually caused by a failure to bind a texture.

enumerator CUTENSOR_STATUS_EXECUTION_FAILED

The GPU program failed to execute. This is often caused by a launch failure of the kernel on the GPU, which can have multiple causes.

enumerator CUTENSOR_STATUS_INTERNAL_ERROR

An internal cuTENSOR error has occurred.

enumerator CUTENSOR_STATUS_NOT_SUPPORTED

The requested operation is not supported.

enumerator CUTENSOR_STATUS_LICENSE_ERROR

The requested functionality requires a license, and an error was detected while checking the current licensing.

enumerator CUTENSOR_STATUS_CUBLAS_ERROR

A call to CUBLAS did not succeed.

enumerator CUTENSOR_STATUS_CUDA_ERROR

Some unknown CUDA error has occurred.

enumerator CUTENSOR_STATUS_INSUFFICIENT_WORKSPACE

The provided workspace was insufficient.

enumerator CUTENSOR_STATUS_INSUFFICIENT_DRIVER

Indicates that the driver version is insufficient.

enumerator CUTENSOR_STATUS_IO_ERROR

Indicates an error related to file I/O.
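
Because every library function returns a status, callers commonly wrap each call in a check that stops at the first failure. The sketch below shows that pattern in plain C++ with local stand-ins so that it compiles without cuTENSOR headers: ToyStatus, toyStatusName, toyStep, runTwoSteps and TOY_RETURN_IF_ERROR are all hypothetical. With the real library one would compare against CUTENSOR_STATUS_SUCCESS and could print the message returned by cutensorGetErrorString().

```cpp
#include <cstddef>
#include <cstdio>

// Stand-in for cutensorStatus_t (hypothetical; values are arbitrary).
enum ToyStatus {
    TOY_STATUS_SUCCESS,
    TOY_STATUS_INVALID_VALUE,
    TOY_STATUS_INSUFFICIENT_WORKSPACE
};

// Human-readable name of a stand-in status.
const char* toyStatusName(ToyStatus s) {
    switch (s) {
        case TOY_STATUS_SUCCESS:                return "success";
        case TOY_STATUS_INVALID_VALUE:          return "invalid value";
        case TOY_STATUS_INSUFFICIENT_WORKSPACE: return "insufficient workspace";
    }
    return "unknown";
}

// Evaluate a call once; on failure, log it and return the status to the caller.
#define TOY_RETURN_IF_ERROR(call)                                        \
    do {                                                                 \
        const ToyStatus status_ = (call);                                \
        if (status_ != TOY_STATUS_SUCCESS) {                             \
            std::fprintf(stderr, "%s failed: %s\n", #call,               \
                         toyStatusName(status_));                        \
            return status_;                                              \
        }                                                                \
    } while (0)

// Hypothetical library call: fails if the provided workspace is too small.
ToyStatus toyStep(std::size_t providedBytes, std::size_t requiredBytes) {
    return providedBytes >= requiredBytes ? TOY_STATUS_SUCCESS
                                          : TOY_STATUS_INSUFFICIENT_WORKSPACE;
}

// Two dependent steps; the second is skipped if the first one fails.
ToyStatus runTwoSteps(std::size_t workspaceBytes) {
    TOY_RETURN_IF_ERROR(toyStep(workspaceBytes, 1024));
    TOY_RETURN_IF_ERROR(toyStep(workspaceBytes, 4096));
    return TOY_STATUS_SUCCESS;
}
```

Returning the status unchanged, rather than converting it to a boolean, lets the caller distinguish recoverable conditions such as an insufficient workspace from user errors such as an invalid value.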

cudaDataType_t

enum cudaDataType_t

cudaDataType_t is an enumeration of the types supported by CUDA libraries. cuTENSOR supports real FP16, BF16, FP32 and FP64 as well as complex FP32 and FP64 input types.

Values:

enumerator CUDA_R_16F

16-bit real half precision floating-point type

enumerator CUDA_R_16BF

16-bit real BF16 floating-point type

enumerator CUDA_R_32F

32-bit real single precision floating-point type

enumerator CUDA_C_32F

32-bit complex single precision floating-point type (represented as pair of real and imaginary part)

enumerator CUDA_R_64F

64-bit real double precision floating-point type

enumerator CUDA_C_64F

64-bit complex double precision floating-point type (represented as pair of real and imaginary part)
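
The bit widths above translate directly into buffer sizes. The following sketch computes the number of bytes needed for a densely packed tensor, assuming no padding between elements. It is plain C++ with a local stand-in enum: ToyDataType, elementSize and tensorBytes are hypothetical names, and real code would use cudaDataType_t from the CUDA headers instead.

```cpp
#include <complex>
#include <cstddef>
#include <cstdint>
#include <vector>

// Local stand-in mirroring the cudaDataType_t enumerators listed above.
enum class ToyDataType { R_16F, R_16BF, R_32F, C_32F, R_64F, C_64F };

// Size of one element in bytes, derived from the documented bit widths.
std::size_t elementSize(ToyDataType t) {
    switch (t) {
        case ToyDataType::R_16F:
        case ToyDataType::R_16BF: return 2;  // 16 bits
        case ToyDataType::R_32F:  return sizeof(float);
        case ToyDataType::C_32F:  return sizeof(std::complex<float>);   // real + imaginary
        case ToyDataType::R_64F:  return sizeof(double);
        case ToyDataType::C_64F:  return sizeof(std::complex<double>);  // real + imaginary
    }
    return 0;  // unreachable for valid enumerators
}

// Bytes for a densely packed tensor with the given extents (assumes no padding).
std::size_t tensorBytes(const std::vector<std::int64_t>& extents, ToyDataType t) {
    std::size_t count = 1;
    for (std::int64_t e : extents) count *= static_cast<std::size_t>(e);
    return count * elementSize(t);
}
```

For example, a 4 x 8 tensor of CUDA_R_16F elements occupies 64 bytes under these assumptions, and a 2 x 3 tensor of CUDA_C_64F elements occupies 96 bytes.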