1.1.1. FP8 Conversion and Data Movement

[FP8 Intrinsics]

To use these functions, include the header file cuda_fp8.h in your program.

Typedefs

typedef unsigned char __nv_fp8_storage_t
8-bit unsigned integer type abstraction used for storage of fp8 floating-point numbers.
typedef unsigned short int __nv_fp8x2_storage_t
16-bit unsigned integer type abstraction used for storage of pairs of fp8 floating-point numbers.
typedef unsigned int __nv_fp8x4_storage_t
32-bit unsigned integer type abstraction used for storage of groups of four fp8 floating-point numbers.

Enumerations

enum __nv_fp8_interpretation_t
Enumerates the possible interpretations of the 8-bit values when referring to them as fp8 types.
enum __nv_saturation_t
Enumerates the modes applicable when performing a narrowing conversion to fp8 destination types.

Functions

__host__ __device__ __nv_fp8x2_storage_t __nv_cvt_bfloat16raw2_to_fp8x2 ( const __nv_bfloat162_raw x, const __nv_saturation_t saturate, const __nv_fp8_interpretation_t fp8_interpretation )
Converts input vector of two nv_bfloat16 precision numbers packed in __nv_bfloat162_raw x into a vector of two values of fp8 type of the requested kind using round-to-nearest-even rounding and requested saturation mode.
__host__ __device__ __nv_fp8_storage_t __nv_cvt_bfloat16raw_to_fp8 ( const __nv_bfloat16_raw x, const __nv_saturation_t saturate, const __nv_fp8_interpretation_t fp8_interpretation )
Converts input nv_bfloat16 precision x to fp8 type of the requested kind using round-to-nearest-even rounding and requested saturation mode.
__host__ __device__ __nv_fp8x2_storage_t __nv_cvt_double2_to_fp8x2 ( const double2 x, const __nv_saturation_t saturate, const __nv_fp8_interpretation_t fp8_interpretation )
Converts input vector of two double precision numbers packed in double2 x into a vector of two values of fp8 type of the requested kind using round-to-nearest-even rounding and requested saturation mode.
__host__ __device__ __nv_fp8_storage_t __nv_cvt_double_to_fp8 ( const double x, const __nv_saturation_t saturate, const __nv_fp8_interpretation_t fp8_interpretation )
Converts input double precision x to fp8 type of the requested kind using round-to-nearest-even rounding and requested saturation mode.
__host__ __device__ __nv_fp8x2_storage_t __nv_cvt_float2_to_fp8x2 ( const float2 x, const __nv_saturation_t saturate, const __nv_fp8_interpretation_t fp8_interpretation )
Converts input vector of two single precision numbers packed in float2 x into a vector of two values of fp8 type of the requested kind using round-to-nearest-even rounding and requested saturation mode.
__host__ __device__ __nv_fp8_storage_t __nv_cvt_float_to_fp8 ( const float x, const __nv_saturation_t saturate, const __nv_fp8_interpretation_t fp8_interpretation )
Converts input single precision x to fp8 type of the requested kind using round-to-nearest-even rounding and requested saturation mode.
__host__ __device__ __half_raw __nv_cvt_fp8_to_halfraw ( const __nv_fp8_storage_t x, const __nv_fp8_interpretation_t fp8_interpretation )
Converts input fp8 x of the specified kind to half precision.
__host__ __device__ __half2_raw __nv_cvt_fp8x2_to_halfraw2 ( const __nv_fp8x2_storage_t x, const __nv_fp8_interpretation_t fp8_interpretation )
Converts input vector of two fp8 values of the specified kind to a vector of two half precision values packed in __half2_raw structure.
__host__ __device__ __nv_fp8x2_storage_t __nv_cvt_halfraw2_to_fp8x2 ( const __half2_raw x, const __nv_saturation_t saturate, const __nv_fp8_interpretation_t fp8_interpretation )
Converts input vector of two half precision numbers packed in __half2_raw x into a vector of two values of fp8 type of the requested kind using round-to-nearest-even rounding and requested saturation mode.
__host__ __device__ __nv_fp8_storage_t __nv_cvt_halfraw_to_fp8 ( const __half_raw x, const __nv_saturation_t saturate, const __nv_fp8_interpretation_t fp8_interpretation )
Converts input half precision x to fp8 type of the requested kind using round-to-nearest-even rounding and requested saturation mode.

Typedefs

typedef unsigned char __nv_fp8_storage_t

8-bit unsigned integer type abstraction used for storage of fp8 floating-point numbers.

typedef unsigned short int __nv_fp8x2_storage_t

16-bit unsigned integer type abstraction used for storage of pairs of fp8 floating-point numbers.

typedef unsigned int __nv_fp8x4_storage_t

32-bit unsigned integer type abstraction used for storage of groups of four fp8 floating-point numbers.

Enumerations

enum __nv_fp8_interpretation_t

Values
__NV_E4M3
Stands for fp8 numbers of e4m3 kind.
__NV_E5M2
Stands for fp8 numbers of e5m2 kind.
enum __nv_saturation_t

Values
__NV_NOSAT
No saturation to finite is performed when the conversion rounds a value outside the range of the destination type. NOTE: for the fp8 type of e4m3 kind, results that are larger than the maximum representable finite number of the target format become NaN.
__NV_SATFINITE
Inputs whose magnitude exceeds the maximum representable finite number MAXNORM of the target format round to MAXNORM with the same sign as the input.
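As a minimal sketch of how the two saturation modes differ, the helpers below convert a float under each mode. The names to_e4m3 and to_e5m2 are hypothetical. The byte values in the comments are assumptions not stated in this section: they rely on the usual e4m3 encoding (MAXNORM 448 stored as 0x7E, NaN as 0x7F) and the usual e5m2 encoding (MAXNORM 57344 stored as 0x7B, infinity as 0x7C).

```cuda
#include <cuda_fp8.h>

// Hypothetical helper: convert one float to e4m3 with the given saturation mode.
__host__ __device__ inline __nv_fp8_storage_t to_e4m3(float v, __nv_saturation_t sat) {
    return __nv_cvt_float_to_fp8(v, sat, __NV_E4M3);
}

// Hypothetical helper: convert one float to e5m2 with the given saturation mode.
__host__ __device__ inline __nv_fp8_storage_t to_e5m2(float v, __nv_saturation_t sat) {
    return __nv_cvt_float_to_fp8(v, sat, __NV_E5M2);
}

// Assumed results for an input far above MAXNORM:
//   to_e4m3( 1000.0f, __NV_SATFINITE) -> 0x7E  (+MAXNORM)
//   to_e4m3(-1000.0f, __NV_SATFINITE) -> 0xFE  (-MAXNORM, sign preserved)
//   to_e4m3( 1000.0f, __NV_NOSAT)     -> 0x7F  (NaN, as the NOTE above describes)
//   to_e5m2( 1.0e6f,  __NV_SATFINITE) -> 0x7B  (+MAXNORM)
//   to_e5m2( 1.0e6f,  __NV_NOSAT)     -> 0x7C  (+infinity)
```

__NV_SATFINITE is usually the safer choice when the fp8 values feed later arithmetic, since a single NaN or infinity can propagate through a reduction.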

Functions

__host__ __device__ __nv_fp8x2_storage_t __nv_cvt_bfloat16raw2_to_fp8x2 ( const __nv_bfloat162_raw x, const __nv_saturation_t saturate, const __nv_fp8_interpretation_t fp8_interpretation )
Converts input vector of two nv_bfloat16 precision numbers packed in __nv_bfloat162_raw x into a vector of two values of fp8 type of the requested kind using round-to-nearest-even rounding and requested saturation mode.
Returns

  • The __nv_fp8x2_storage_t value holds the result of conversion.

Description

Converts input vector x to a vector of two fp8 values of the kind specified by fp8_interpretation parameter, using round-to-nearest-even rounding and saturation mode specified by saturate parameter.

__host__ __device__ __nv_fp8_storage_t __nv_cvt_bfloat16raw_to_fp8 ( const __nv_bfloat16_raw x, const __nv_saturation_t saturate, const __nv_fp8_interpretation_t fp8_interpretation )
Converts input nv_bfloat16 precision x to fp8 type of the requested kind using round-to-nearest-even rounding and requested saturation mode.
Returns

  • The __nv_fp8_storage_t value holds the result of conversion.

Description

Converts input x to fp8 type of the kind specified by fp8_interpretation parameter, using round-to-nearest-even rounding and saturation mode specified by saturate parameter.
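The following sketch shows one way to call __nv_cvt_bfloat16raw_to_fp8 from raw bfloat16 bits. The helper name bf16_bits_to_fp8 is hypothetical, and the example assumes that __nv_bfloat16_raw exposes its 16 storage bits as the member x and that cuda_fp8.h makes the bfloat16 raw types available.

```cuda
#include <cuda_fp8.h>

// Hypothetical helper: wrap raw bfloat16 bits and convert them to fp8.
__host__ __device__ inline __nv_fp8_storage_t bf16_bits_to_fp8(
        unsigned short bf16_bits, __nv_fp8_interpretation_t kind) {
    __nv_bfloat16_raw raw;
    raw.x = bf16_bits;  // assumed layout: 16 raw storage bits in member x
    return __nv_cvt_bfloat16raw_to_fp8(raw, __NV_SATFINITE, kind);
}

// bfloat16 1.0 is 0x3F80 (the upper half of the float pattern 0x3F800000).
// Assumed encodings of 1.0: 0x38 in e4m3, 0x3C in e5m2.
```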

__host__ __device__ __nv_fp8x2_storage_t __nv_cvt_double2_to_fp8x2 ( const double2 x, const __nv_saturation_t saturate, const __nv_fp8_interpretation_t fp8_interpretation )
Converts input vector of two double precision numbers packed in double2 x into a vector of two values of fp8 type of the requested kind using round-to-nearest-even rounding and requested saturation mode.
Returns

  • The __nv_fp8x2_storage_t value holds the result of conversion.

Description

Converts input vector x to a vector of two fp8 values of the kind specified by fp8_interpretation parameter, using round-to-nearest-even rounding and saturation mode specified by saturate parameter.

__host__ __device__ __nv_fp8_storage_t __nv_cvt_double_to_fp8 ( const double x, const __nv_saturation_t saturate, const __nv_fp8_interpretation_t fp8_interpretation )
Converts input double precision x to fp8 type of the requested kind using round-to-nearest-even rounding and requested saturation mode.
Returns

  • The __nv_fp8_storage_t value holds the result of conversion.

Description

Converts input x to fp8 type of the kind specified by fp8_interpretation parameter, using round-to-nearest-even rounding and saturation mode specified by saturate parameter.

__host__ __device__ __nv_fp8x2_storage_t __nv_cvt_float2_to_fp8x2 ( const float2 x, const __nv_saturation_t saturate, const __nv_fp8_interpretation_t fp8_interpretation )
Converts input vector of two single precision numbers packed in float2 x into a vector of two values of fp8 type of the requested kind using round-to-nearest-even rounding and requested saturation mode.
Returns

  • The __nv_fp8x2_storage_t value holds the result of conversion.

Description

Converts input vector x to a vector of two fp8 values of the kind specified by fp8_interpretation parameter, using round-to-nearest-even rounding and saturation mode specified by saturate parameter.
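A short sketch of the packed conversion follows. The helper name pack_e4m3 is hypothetical. The byte order noted in the comments (x.x in the low byte, x.y in the high byte) is an assumption based on the behavior of the shipped header and is not stated in this section, so verify it against your toolkit version before relying on it.

```cuda
#include <cuda_fp8.h>

// Hypothetical helper: convert two floats to a packed pair of e4m3 values.
__host__ __device__ inline __nv_fp8x2_storage_t pack_e4m3(float a, float b) {
    float2 v;
    v.x = a;
    v.y = b;
    return __nv_cvt_float2_to_fp8x2(v, __NV_SATFINITE, __NV_E4M3);
}

// Extract the two stored bytes from the 16-bit packed result.
__host__ __device__ inline unsigned low_byte(__nv_fp8x2_storage_t p)  { return p & 0xFFu; }
__host__ __device__ inline unsigned high_byte(__nv_fp8x2_storage_t p) { return (p >> 8) & 0xFFu; }

// Assumed layout: low_byte(pack_e4m3(a, b)) matches the scalar conversion
// of a, and high_byte(...) matches the scalar conversion of b.
```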

__host__ __device__ __nv_fp8_storage_t __nv_cvt_float_to_fp8 ( const float x, const __nv_saturation_t saturate, const __nv_fp8_interpretation_t fp8_interpretation )
Converts input single precision x to fp8 type of the requested kind using round-to-nearest-even rounding and requested saturation mode.
Returns

  • The __nv_fp8_storage_t value holds the result of conversion.

Description

Converts input x to fp8 type of the kind specified by fp8_interpretation parameter, using round-to-nearest-even rounding and saturation mode specified by saturate parameter.
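The sketch below illustrates the scalar conversion in both host and device code. The names quantize and quantize_e4m3_kernel are hypothetical, and the byte values in the comments assume the usual encodings (e4m3: 4 exponent bits with bias 7 and 3 mantissa bits; e5m2: 5 exponent bits with bias 15 and 2 mantissa bits).

```cuda
#include <cuda_fp8.h>

// Hypothetical helper: convert one float to fp8 of the requested kind,
// saturating out-of-range inputs to MAXNORM.
__host__ __device__ inline __nv_fp8_storage_t quantize(float v,
                                                       __nv_fp8_interpretation_t kind) {
    return __nv_cvt_float_to_fp8(v, __NV_SATFINITE, kind);
}

// Hypothetical kernel: quantize n floats to e4m3, one element per thread.
__global__ void quantize_e4m3_kernel(const float* in, __nv_fp8_storage_t* out, int n) {
    const int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) {
        out[i] = quantize(in[i], __NV_E4M3);
    }
}

// Assumed results, showing the precision difference between the two kinds:
//   quantize(1.0f, __NV_E4M3) -> 0x38    quantize(1.0f, __NV_E5M2) -> 0x3C
//   quantize(1.1f, __NV_E4M3) -> 0x39    (rounds to 1.125, spacing 0.125 near 1)
//   quantize(1.1f, __NV_E5M2) -> 0x3C    (rounds to 1.0,   spacing 0.25 near 1)
```

A launch such as quantize_e4m3_kernel<<<(n + 255) / 256, 256>>>(d_in, d_out, n) covers all n elements, where d_in and d_out are device buffers allocated by the caller.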

__host__ __device__ __half_raw __nv_cvt_fp8_to_halfraw ( const __nv_fp8_storage_t x, const __nv_fp8_interpretation_t fp8_interpretation )
Converts input fp8 x of the specified kind to half precision.
Returns

  • The __half_raw value holds the result of conversion.

Description

Converts input x of fp8 type of the kind specified by fp8_interpretation parameter to half precision.
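Because the storage type is a plain byte, the same bit pattern decodes to different values depending on fp8_interpretation. The sketch below widens an fp8 byte to float by way of __half_raw. The helper name fp8_to_float is hypothetical, and the decoded values in the comments assume the usual e4m3 and e5m2 encodings.

```cuda
#include <cuda_fp8.h>
#include <cuda_fp16.h>

// Hypothetical helper: decode one fp8 byte to float via half precision.
__host__ __device__ inline float fp8_to_float(__nv_fp8_storage_t bits,
                                              __nv_fp8_interpretation_t kind) {
    const __half_raw raw = __nv_cvt_fp8_to_halfraw(bits, kind);
    return __half2float(__half(raw));  // half -> float widening is exact
}

// Assumed results:
//   fp8_to_float(0x38, __NV_E4M3) -> 1.0f
//   fp8_to_float(0x38, __NV_E5M2) -> 0.5f   (same byte, different kind)
//   fp8_to_float(0x7E, __NV_E4M3) -> 448.0f (e4m3 MAXNORM)
```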

__host__ __device__ __half2_raw __nv_cvt_fp8x2_to_halfraw2 ( const __nv_fp8x2_storage_t x, const __nv_fp8_interpretation_t fp8_interpretation )
Converts input vector of two fp8 values of the specified kind to a vector of two half precision values packed in __half2_raw structure.
Returns

  • The __half2_raw value holds the result of conversion.

Description

Converts input vector x of fp8 type of the kind specified by fp8_interpretation parameter to a vector of two half precision values and returns as __half2_raw structure.
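A sketch of the packed decode follows. The helper name fp8x2_to_float2 is hypothetical. The example assumes that __half2_raw exposes its two 16-bit halves as members x and y, and that the low byte of the input maps to x and the high byte to y; that ordering is not stated in this section.

```cuda
#include <cuda_fp8.h>
#include <cuda_fp16.h>

// Hypothetical helper: decode a packed pair of fp8 values to two floats.
__host__ __device__ inline float2 fp8x2_to_float2(__nv_fp8x2_storage_t bits,
                                                  __nv_fp8_interpretation_t kind) {
    const __half2_raw pair = __nv_cvt_fp8x2_to_halfraw2(bits, kind);

    // Split the raw pair into two raw halves so each can be widened separately.
    __half_raw first;
    first.x = pair.x;
    __half_raw second;
    second.x = pair.y;

    float2 out;
    out.x = __half2float(__half(first));
    out.y = __half2float(__half(second));
    return out;
}

// Assumed result: fp8x2_to_float2(0x3038, __NV_E4M3) yields x = 1.0f (low
// byte 0x38) and y = 0.5f (high byte 0x30).
```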

__host__ __device__ __nv_fp8x2_storage_t __nv_cvt_halfraw2_to_fp8x2 ( const __half2_raw x, const __nv_saturation_t saturate, const __nv_fp8_interpretation_t fp8_interpretation )
Converts input vector of two half precision numbers packed in __half2_raw x into a vector of two values of fp8 type of the requested kind using round-to-nearest-even rounding and requested saturation mode.
Returns

  • The __nv_fp8x2_storage_t value holds the result of conversion.

Description

Converts input vector x to a vector of two fp8 values of the kind specified by fp8_interpretation parameter, using round-to-nearest-even rounding and saturation mode specified by saturate parameter.

__host__ __device__ __nv_fp8_storage_t __nv_cvt_halfraw_to_fp8 ( const __half_raw x, const __nv_saturation_t saturate, const __nv_fp8_interpretation_t fp8_interpretation )
Converts input half precision x to fp8 type of the requested kind using round-to-nearest-even rounding and requested saturation mode.
Returns

  • The __nv_fp8_storage_t value holds the result of conversion.

Description

Converts input x to fp8 type of the kind specified by fp8_interpretation parameter, using round-to-nearest-even rounding and saturation mode specified by saturate parameter.
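As an illustration, the sketch below narrows raw half precision bits to fp8 and widens them back, which gives a quick check that exactly representable values survive the round trip. The helper names half_bits_to_fp8 and round_trip_half_bits are hypothetical, and the example assumes that __half_raw exposes its 16 storage bits as the member x.

```cuda
#include <cuda_fp8.h>
#include <cuda_fp16.h>

// Hypothetical helper: narrow raw half bits to fp8 with saturation to MAXNORM.
__host__ __device__ inline __nv_fp8_storage_t half_bits_to_fp8(
        unsigned short half_bits, __nv_fp8_interpretation_t kind) {
    __half_raw raw;
    raw.x = half_bits;  // assumed layout: 16 raw storage bits in member x
    return __nv_cvt_halfraw_to_fp8(raw, __NV_SATFINITE, kind);
}

// Hypothetical helper: half -> fp8 -> half, returning the resulting half bits.
__host__ __device__ inline unsigned short round_trip_half_bits(
        unsigned short half_bits, __nv_fp8_interpretation_t kind) {
    const __nv_fp8_storage_t narrow = half_bits_to_fp8(half_bits, kind);
    return __nv_cvt_fp8_to_halfraw(narrow, kind).x;
}

// Half 1.0 is 0x3C00. Assumed results:
//   half_bits_to_fp8(0x3C00, __NV_E4M3) -> 0x38
//   half_bits_to_fp8(0x3C00, __NV_E5M2) -> 0x3C
//   half_bits_to_fp8(0x7BFF, __NV_E4M3) -> 0x7E  (65504 saturates to MAXNORM 448)
//   round_trip_half_bits(0x3C00, kind)  -> 0x3C00 for either kind
```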