CUDA Math API :: CUDA Toolkit Documentation

1.1.1. FP8 Conversion and Data Movement

[FP8 Intrinsics]

To use these functions, include the header file cuda_fp8.h in your program.

Typedefs

typedef unsigned char __nv_fp8_storage_t: 8-bit unsigned integer type abstraction used for storage of fp8 floating-point numbers.
typedef unsigned short int __nv_fp8x2_storage_t: 16-bit unsigned integer type abstraction used for storage of pairs of fp8 floating-point numbers.
typedef unsigned int __nv_fp8x4_storage_t: 32-bit unsigned integer type abstraction used for storage of tetrads of fp8 floating-point numbers.
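
Because the storage types are plain unsigned integers, several fp8 values can be packed and unpacked with ordinary shifts and masks. The following is a minimal host-side sketch, not code from cuda_fp8.h: the aliases and helper names (pack_fp8x2, unpack_fp8x2, pack_fp8x4) are hypothetical, and the lane order (element 0 in the least significant byte) is an assumption of this sketch.

```cpp
#include <cassert>

// Local stand-ins mirroring the typedefs above, so the sketch builds without cuda_fp8.h.
using fp8_storage_t   = unsigned char;       // one fp8 value
using fp8x2_storage_t = unsigned short int;  // pair of fp8 values
using fp8x4_storage_t = unsigned int;        // tetrad of fp8 values

static_assert(sizeof(fp8x2_storage_t) == 2, "expected a 16-bit pair type");
static_assert(sizeof(fp8x4_storage_t) == 4, "expected a 32-bit tetrad type");

// ASSUMPTION: element 0 occupies the least significant byte.
inline fp8x2_storage_t pack_fp8x2(fp8_storage_t lo, fp8_storage_t hi) {
    return static_cast<fp8x2_storage_t>((static_cast<unsigned>(hi) << 8) | lo);
}

// Extract element `index` (0 or 1) from a packed pair.
inline fp8_storage_t unpack_fp8x2(fp8x2_storage_t v, int index) {
    assert(index == 0 || index == 1);
    return static_cast<fp8_storage_t>((v >> (8 * index)) & 0xFFu);
}

// Combine two packed pairs into a tetrad, low pair first (same lane-order assumption).
inline fp8x4_storage_t pack_fp8x4(fp8x2_storage_t lo, fp8x2_storage_t hi) {
    return (static_cast<fp8x4_storage_t>(hi) << 16) | lo;
}
```

For example, pack_fp8x2(0x38, 0x40) yields 0x4038 under this lane order, and unpack_fp8x2 recovers each byte.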

Enumerations

enum __nv_fp8_interpretation_t: Enumerates the possible interpretations of the 8-bit values when referring to them as fp8 types.
enum __nv_saturation_t: Enumerates the modes applicable when performing a narrowing conversion to fp8 destination types.

Functions

__host__ __device__ __nv_fp8x2_storage_t __nv_cvt_bfloat16raw2_to_fp8x2 ( const __nv_bfloat162_raw x, const __nv_saturation_t saturate, const __nv_fp8_interpretation_t fp8_interpretation ): Converts input vector of two nv_bfloat16 precision numbers packed in __nv_bfloat162_raw x into a vector of two values of fp8 type of the requested kind using round-to-nearest-even rounding and requested saturation mode.
__host__ __device__ __nv_fp8_storage_t __nv_cvt_bfloat16raw_to_fp8 ( const __nv_bfloat16_raw x, const __nv_saturation_t saturate, const __nv_fp8_interpretation_t fp8_interpretation ): Converts input nv_bfloat16 precision x to fp8 type of the requested kind using round-to-nearest-even rounding and requested saturation mode.
__host__ __device__ __nv_fp8x2_storage_t __nv_cvt_double2_to_fp8x2 ( const double2 x, const __nv_saturation_t saturate, const __nv_fp8_interpretation_t fp8_interpretation ): Converts input vector of two double precision numbers packed in double2 x into a vector of two values of fp8 type of the requested kind using round-to-nearest-even rounding and requested saturation mode.
__host__ __device__ __nv_fp8_storage_t __nv_cvt_double_to_fp8 ( const double x, const __nv_saturation_t saturate, const __nv_fp8_interpretation_t fp8_interpretation ): Converts input double precision x to fp8 type of the requested kind using round-to-nearest-even rounding and requested saturation mode.
__host__ __device__ __nv_fp8x2_storage_t __nv_cvt_float2_to_fp8x2 ( const float2 x, const __nv_saturation_t saturate, const __nv_fp8_interpretation_t fp8_interpretation ): Converts input vector of two single precision numbers packed in float2 x into a vector of two values of fp8 type of the requested kind using round-to-nearest-even rounding and requested saturation mode.
__host__ __device__ __nv_fp8_storage_t __nv_cvt_float_to_fp8 ( const float x, const __nv_saturation_t saturate, const __nv_fp8_interpretation_t fp8_interpretation ): Converts input single precision x to fp8 type of the requested kind using round-to-nearest-even rounding and requested saturation mode.
__host__ __device__ __half_raw __nv_cvt_fp8_to_halfraw ( const __nv_fp8_storage_t x, const __nv_fp8_interpretation_t fp8_interpretation ): Converts input fp8 x of the specified kind to half precision.
__host__ __device__ __half2_raw __nv_cvt_fp8x2_to_halfraw2 ( const __nv_fp8x2_storage_t x, const __nv_fp8_interpretation_t fp8_interpretation ): Converts input vector of two fp8 values of the specified kind to a vector of two half precision values packed in __half2_raw structure.
__host__ __device__ __nv_fp8x2_storage_t __nv_cvt_halfraw2_to_fp8x2 ( const __half2_raw x, const __nv_saturation_t saturate, const __nv_fp8_interpretation_t fp8_interpretation ): Converts input vector of two half precision numbers packed in __half2_raw x into a vector of two values of fp8 type of the requested kind using round-to-nearest-even rounding and requested saturation mode.
__host__ __device__ __nv_fp8_storage_t __nv_cvt_halfraw_to_fp8 ( const __half_raw x, const __nv_saturation_t saturate, const __nv_fp8_interpretation_t fp8_interpretation ): Converts input half precision x to fp8 type of the requested kind using round-to-nearest-even rounding and requested saturation mode.

Enumerations

enum __nv_fp8_interpretation_t

Values

__NV_E4M3: Stands for fp8 numbers of e4m3 kind.
__NV_E5M2: Stands for fp8 numbers of e5m2 kind.
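
The interpretation decides how the same 8 bits are split into fields. As background not stated above: e4m3 uses 1 sign, 4 exponent and 3 mantissa bits (bias 7, no infinities, one NaN pattern per sign), while e5m2 uses 1 sign, 5 exponent and 2 mantissa bits (bias 15, with IEEE-style infinities and NaNs). A minimal software decoder under those assumptions might look as follows; Fp8Kind and decode_fp8 are hypothetical names, not part of cuda_fp8.h.

```cpp
#include <cmath>
#include <limits>

enum class Fp8Kind { E4M3, E5M2 };  // hypothetical stand-in for __nv_fp8_interpretation_t

// Software model: value represented by one fp8 byte under the given interpretation.
inline float decode_fp8(unsigned char bits, Fp8Kind kind) {
    const int mant_bits = (kind == Fp8Kind::E4M3) ? 3 : 2;
    const int exp_bits  = 7 - mant_bits;
    const int bias      = (1 << (exp_bits - 1)) - 1;  // 7 for e4m3, 15 for e5m2
    const int exp_max   = (1 << exp_bits) - 1;
    const int exp       = (bits >> mant_bits) & exp_max;
    const int mant      = bits & ((1 << mant_bits) - 1);
    const float sign    = (bits & 0x80) ? -1.0f : 1.0f;

    // e5m2 reserves the all-ones exponent for infinity (mant == 0) and NaN.
    if (kind == Fp8Kind::E5M2 && exp == exp_max) {
        return mant == 0 ? sign * std::numeric_limits<float>::infinity()
                         : std::numeric_limits<float>::quiet_NaN();
    }
    // e4m3 has no infinity; only exponent and mantissa all ones encode NaN.
    if (kind == Fp8Kind::E4M3 && exp == exp_max && mant == 7) {
        return std::numeric_limits<float>::quiet_NaN();
    }
    // Subnormals carry no implicit leading one.
    if (exp == 0) {
        return sign * std::ldexp(static_cast<float>(mant), 1 - bias - mant_bits);
    }
    return sign * std::ldexp(static_cast<float>((1 << mant_bits) | mant),
                             exp - bias - mant_bits);
}
```

Under this model the byte 0x7E decodes to 448 as e4m3, while 0x7B decodes to 57344 as e5m2; these are the respective largest finite values.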

enum __nv_saturation_t

Values

__NV_NOSAT: No saturation to finite is performed when the conversion rounds to a value outside the range of the destination type. NOTE: for the fp8 type of e4m3 kind, results larger than the maximum representable finite number of the target format become NaN.
__NV_SATFINITE: Inputs whose magnitude exceeds the maximum representable finite number (MAXNORM) of the target format round to the MAXNORM of the same sign as the input.
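
The difference between the two modes only shows up once a rounded result falls outside the finite range. The value-level sketch below models that decision; it is not the library's implementation. The names are hypothetical, the MAXNORM values (448 for e4m3, 57344 for e5m2) come from the format definitions rather than from the text above, and overflow to infinity for e5m2 under __NV_NOSAT is an assumption based on its IEEE-style encoding.

```cpp
#include <cmath>
#include <limits>

enum class Fp8Kind { E4M3, E5M2 };           // hypothetical stand-in for __nv_fp8_interpretation_t
enum class Saturation { NoSat, SatFinite };  // hypothetical stand-in for __nv_saturation_t

// Largest finite magnitude (MAXNORM) of each format.
inline double max_norm(Fp8Kind kind) {
    return kind == Fp8Kind::E4M3 ? 448.0 : 57344.0;
}

// `rounded` is a value already rounded to fp8 precision but not yet range-checked.
inline double apply_saturation_model(double rounded, Fp8Kind kind, Saturation sat) {
    const double limit = max_norm(kind);
    if (std::isnan(rounded) || std::fabs(rounded) <= limit) {
        return rounded;  // in range (or NaN): the mode has no effect
    }
    if (sat == Saturation::SatFinite) {
        return std::copysign(limit, rounded);  // clamp to MAXNORM, keeping the sign
    }
    if (kind == Fp8Kind::E4M3) {
        return std::numeric_limits<double>::quiet_NaN();  // e4m3 has no infinity
    }
    // ASSUMPTION: e5m2 overflows to infinity of the same sign when not saturating.
    return std::copysign(std::numeric_limits<double>::infinity(), rounded);
}
```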

Functions

__host__ __device__ __nv_fp8x2_storage_t __nv_cvt_bfloat16raw2_to_fp8x2 ( const __nv_bfloat162_raw x, const __nv_saturation_t saturate, const __nv_fp8_interpretation_t fp8_interpretation )

Converts input vector of two nv_bfloat16 precision numbers packed in __nv_bfloat162_raw x into a vector of two values of fp8 type of the requested kind using round-to-nearest-even rounding and requested saturation mode.

Returns

The __nv_fp8x2_storage_t value holds the result of conversion.

Description

Converts input vector x to a vector of two fp8 values of the kind specified by fp8_interpretation parameter, using round-to-nearest-even rounding and saturation mode specified by saturate parameter.

__host__ __device__ __nv_fp8_storage_t __nv_cvt_bfloat16raw_to_fp8 ( const __nv_bfloat16_raw x, const __nv_saturation_t saturate, const __nv_fp8_interpretation_t fp8_interpretation )

Converts input nv_bfloat16 precision x to fp8 type of the requested kind using round-to-nearest-even rounding and requested saturation mode.

Returns

The __nv_fp8_storage_t value holds the result of conversion.

Description

Converts input x to fp8 type of the kind specified by fp8_interpretation parameter, using round-to-nearest-even rounding and saturation mode specified by saturate parameter.

__host__ __device__ __nv_fp8x2_storage_t __nv_cvt_double2_to_fp8x2 ( const double2 x, const __nv_saturation_t saturate, const __nv_fp8_interpretation_t fp8_interpretation )

Converts input vector of two double precision numbers packed in double2 x into a vector of two values of fp8 type of the requested kind using round-to-nearest-even rounding and requested saturation mode.

Returns

The __nv_fp8x2_storage_t value holds the result of conversion.

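Description

Converts input vector x to a vector of two fp8 values of the kind specified by fp8_interpretation parameter, using round-to-nearest-even rounding and saturation mode specified by saturate parameter.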

__host__ __device__ __nv_fp8_storage_t __nv_cvt_double_to_fp8 ( const double x, const __nv_saturation_t saturate, const __nv_fp8_interpretation_t fp8_interpretation )

Converts input double precision x to fp8 type of the requested kind using round-to-nearest-even rounding and requested saturation mode.

Returns

The __nv_fp8_storage_t value holds the result of conversion.

Description

Converts input x to fp8 type of the kind specified by fp8_interpretation parameter, using round-to-nearest-even rounding and saturation mode specified by saturate parameter.

__host__ __device__ __nv_fp8x2_storage_t __nv_cvt_float2_to_fp8x2 ( const float2 x, const __nv_saturation_t saturate, const __nv_fp8_interpretation_t fp8_interpretation )

Converts input vector of two single precision numbers packed in float2 x into a vector of two values of fp8 type of the requested kind using round-to-nearest-even rounding and requested saturation mode.

Returns

The __nv_fp8x2_storage_t value holds the result of conversion.

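Description

Converts input vector x to a vector of two fp8 values of the kind specified by fp8_interpretation parameter, using round-to-nearest-even rounding and saturation mode specified by saturate parameter.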

__host__ __device__ __nv_fp8_storage_t __nv_cvt_float_to_fp8 ( const float x, const __nv_saturation_t saturate, const __nv_fp8_interpretation_t fp8_interpretation )

Converts input single precision x to fp8 type of the requested kind using round-to-nearest-even rounding and requested saturation mode.

Returns

The __nv_fp8_storage_t value holds the result of conversion.

Description

Converts input x to fp8 type of the kind specified by fp8_interpretation parameter, using round-to-nearest-even rounding and saturation mode specified by saturate parameter.
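
To make round-to-nearest-even concrete, here is a slow reference model of a float-to-fp8 narrowing conversion. It is a sketch under stated assumptions, not NVIDIA's implementation, and may differ from __nv_cvt_float_to_fp8 in corner cases. It finds the two neighbouring fp8 magnitudes by linear search, picks the nearer one, and breaks ties toward the code with an even mantissa. All names are hypothetical; the NaN code 0x7F and the e5m2 overflow to infinity (0x7C) under no-saturation are assumptions.

```cpp
#include <cmath>

enum class Fp8Kind { E4M3, E5M2 };           // hypothetical stand-in for __nv_fp8_interpretation_t
enum class Saturation { NoSat, SatFinite };  // hypothetical stand-in for __nv_saturation_t

// Magnitude of a 7-bit code, treating special codes as ordinary numbers so that
// the code after MAXNORM can serve as the "next value up" when rounding.
inline double magnitude(unsigned code, Fp8Kind kind) {
    const int mant_bits = (kind == Fp8Kind::E4M3) ? 3 : 2;
    const int bias      = (kind == Fp8Kind::E4M3) ? 7 : 15;
    const int exp  = static_cast<int>(code >> mant_bits);
    const int mant = static_cast<int>(code & ((1u << mant_bits) - 1u));
    if (exp == 0) {
        return std::ldexp(static_cast<double>(mant), 1 - bias - mant_bits);
    }
    return std::ldexp(static_cast<double>((1 << mant_bits) | mant),
                      exp - bias - mant_bits);
}

inline unsigned char encode_fp8_model(float x, Fp8Kind kind, Saturation sat) {
    const unsigned max_code = (kind == Fp8Kind::E4M3) ? 0x7Eu : 0x7Bu;  // MAXNORM
    if (std::isnan(x)) {
        return static_cast<unsigned char>(0x7Fu);  // ASSUMPTION: a NaN code in both formats
    }
    const unsigned sign = std::signbit(x) ? 0x80u : 0x00u;
    const double a = std::fabs(static_cast<double>(x));

    // Largest finite code whose magnitude does not exceed |x|.
    unsigned below = 0;
    for (unsigned c = 0; c <= max_code; ++c) {
        if (magnitude(c, kind) <= a) below = c;
    }
    const double lo = magnitude(below, kind);
    const double hi = magnitude(below + 1, kind);

    // Round to nearest; on a tie choose the code whose lowest mantissa bit is 0.
    unsigned rounded = below;
    if (a - lo > hi - a) {
        rounded = below + 1;
    } else if (a - lo == hi - a && (below & 1u) != 0u) {
        rounded = below + 1;
    }

    if (rounded > max_code) {  // rounded past MAXNORM: apply the saturation mode
        if (sat == Saturation::SatFinite) {
            rounded = max_code;
        } else {
            rounded = (kind == Fp8Kind::E4M3) ? 0x7Fu   // NaN, as noted for e4m3
                                              : 0x7Cu;  // ASSUMPTION: infinity for e5m2
        }
    }
    return static_cast<unsigned char>(sign | rounded);
}
```

In this model 1.0625f, halfway between the e4m3 values 1.0 (0x38) and 1.125 (0x39), rounds to the even code 0x38, whereas 1.1875f, halfway between 0x39 and 0x3A, rounds up to 0x3A.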

__host__ __device__ __half_raw __nv_cvt_fp8_to_halfraw ( const __nv_fp8_storage_t x, const __nv_fp8_interpretation_t fp8_interpretation )

Converts input fp8 x of the specified kind to half precision.

Returns

The __half_raw value holds the result of conversion.

Description

Converts input x of fp8 type of the kind specified by fp8_interpretation parameter to half precision.
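
Widening to half precision loses no information, because the IEEE binary16 format (1 sign, 5 exponent, 10 mantissa bits, bias 15) covers the range and precision of both fp8 kinds. The sketch below models the bit-level mapping with hypothetical helpers that return the raw 16 bits; it is not the library code. For e5m2 the fp8 byte is simply the upper byte of the half; for e4m3 the exponent is re-biased and subnormals are normalized. The NaN payload 0x7FFF is an assumption.

```cpp
// Model of e5m2 -> binary16 raw bits: same sign and exponent fields, mantissa zero-extended.
inline unsigned short e5m2_to_half_bits(unsigned char bits) {
    return static_cast<unsigned short>(static_cast<unsigned>(bits) << 8);
}

// Model of e4m3 -> binary16 raw bits.
inline unsigned short e4m3_to_half_bits(unsigned char bits) {
    const unsigned sign = (bits & 0x80u) << 8;  // move the sign to bit 15
    unsigned exp  = (bits >> 3) & 0xFu;
    unsigned mant = bits & 0x7u;

    if (exp == 0xFu && mant == 0x7u) {
        return static_cast<unsigned short>(sign | 0x7FFFu);  // ASSUMPTION: NaN payload
    }
    if (exp == 0u) {
        if (mant == 0u) return static_cast<unsigned short>(sign);  // signed zero
        // Subnormal mant * 2^-9: shift until the leading one reaches bit 3.
        int shift = 0;
        while ((mant & 0x8u) == 0u) { mant <<= 1; ++shift; }
        exp = static_cast<unsigned>(9 - shift);  // biased half exponent (-6 - shift + 15)
        return static_cast<unsigned short>(sign | (exp << 10) | ((mant & 0x7u) << 7));
    }
    // Normal number: re-bias the exponent from 7 to 15 and widen the mantissa.
    return static_cast<unsigned short>(sign | ((exp + 8u) << 10) | (mant << 7));
}
```

For instance, the e5m2 byte 0x3C and the e4m3 byte 0x38 both encode 1.0, and both map to the half bit pattern 0x3C00 in this model.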

__host__ __device__ __half2_raw __nv_cvt_fp8x2_to_halfraw2 ( const __nv_fp8x2_storage_t x, const __nv_fp8_interpretation_t fp8_interpretation )

Converts input vector of two fp8 values of the specified kind to a vector of two half precision values packed in __half2_raw structure.

Returns

The __half2_raw value holds the result of conversion.

Description

Converts input vector x of fp8 type of the kind specified by fp8_interpretation parameter to a vector of two half precision values and returns as __half2_raw structure.

__host__ __device__ __nv_fp8x2_storage_t __nv_cvt_halfraw2_to_fp8x2 ( const __half2_raw x, const __nv_saturation_t saturate, const __nv_fp8_interpretation_t fp8_interpretation )

Converts input vector of two half precision numbers packed in __half2_raw x into a vector of two values of fp8 type of the requested kind using round-to-nearest-even rounding and requested saturation mode.

Returns

The __nv_fp8x2_storage_t value holds the result of conversion.

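Description

Converts input vector x to a vector of two fp8 values of the kind specified by fp8_interpretation parameter, using round-to-nearest-even rounding and saturation mode specified by saturate parameter.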

__host__ __device__ __nv_fp8_storage_t __nv_cvt_halfraw_to_fp8 ( const __half_raw x, const __nv_saturation_t saturate, const __nv_fp8_interpretation_t fp8_interpretation )

Converts input half precision x to fp8 type of the requested kind using round-to-nearest-even rounding and requested saturation mode.

Returns

The __nv_fp8_storage_t value holds the result of conversion.

Description

Converts input x to fp8 type of the kind specified by fp8_interpretation parameter, using round-to-nearest-even rounding and saturation mode specified by saturate parameter.