1.4. FP4 Conversion and Data Movement
To use these functions, include the header file cuda_fp4.h in your program. 
Enumerations
- __nv_fp4_interpretation_t
- 
Enumerates the possible interpretations of the 4-bit values when referring to them as fp4 types.
Functions
- __host__ __device__ __nv_fp4x2_storage_t __nv_cvt_bfloat16raw2_to_fp4x2(const __nv_bfloat162_raw x, const __nv_fp4_interpretation_t fp4_interpretation, const enum cudaRoundMode rounding)
- 
Converts input vector of two nv_bfloat16 precision numbers packed in __nv_bfloat162_raw x into a vector of two values of fp4 type of the requested kind using specified rounding mode and saturating the out-of-range values.
- __host__ __device__ __nv_fp4_storage_t __nv_cvt_bfloat16raw_to_fp4(const __nv_bfloat16_raw x, const __nv_fp4_interpretation_t fp4_interpretation, const enum cudaRoundMode rounding)
- 
Converts input nv_bfloat16 precision x to fp4 type of the requested kind using specified rounding mode and saturating the out-of-range values.
- __host__ __device__ __nv_fp4x2_storage_t __nv_cvt_double2_to_fp4x2(const double2 x, const __nv_fp4_interpretation_t fp4_interpretation, const enum cudaRoundMode rounding)
- 
Converts input vector of two double precision numbers packed in double2 x into a vector of two values of fp4 type of the requested kind using specified rounding mode and saturating the out-of-range values.
- __host__ __device__ __nv_fp4_storage_t __nv_cvt_double_to_fp4(const double x, const __nv_fp4_interpretation_t fp4_interpretation, const enum cudaRoundMode rounding)
- 
Converts input double precision x to fp4 type of the requested kind using specified rounding mode and saturating the out-of-range values.
- __host__ __device__ __nv_fp4x2_storage_t __nv_cvt_float2_to_fp4x2(const float2 x, const __nv_fp4_interpretation_t fp4_interpretation, const enum cudaRoundMode rounding)
- 
Converts input vector of two single precision numbers packed in float2 x into a vector of two values of fp4 type of the requested kind using specified rounding mode and saturating the out-of-range values.
- __host__ __device__ __nv_fp4_storage_t __nv_cvt_float_to_fp4(const float x, const __nv_fp4_interpretation_t fp4_interpretation, const enum cudaRoundMode rounding)
- 
Converts input single precision x to fp4 type of the requested kind using specified rounding mode and saturating the out-of-range values.
- __host__ __device__ __half_raw __nv_cvt_fp4_to_halfraw(const __nv_fp4_storage_t x, const __nv_fp4_interpretation_t fp4_interpretation)
- 
Converts input fp4 x of the specified kind to half precision.
- __host__ __device__ __half2_raw __nv_cvt_fp4x2_to_halfraw2(const __nv_fp4x2_storage_t x, const __nv_fp4_interpretation_t fp4_interpretation)
- 
Converts input vector of two fp4 values of the specified kind to a vector of two half precision values packed in __half2_raw structure.
- __host__ __device__ __nv_fp4x2_storage_t __nv_cvt_halfraw2_to_fp4x2(const __half2_raw x, const __nv_fp4_interpretation_t fp4_interpretation, const enum cudaRoundMode rounding)
- 
Converts input vector of two half precision numbers packed in __half2_raw x into a vector of two values of fp4 type of the requested kind using specified rounding mode and saturating the out-of-range values.
- __host__ __device__ __nv_fp4_storage_t __nv_cvt_halfraw_to_fp4(const __half_raw x, const __nv_fp4_interpretation_t fp4_interpretation, const enum cudaRoundMode rounding)
- 
Converts input half precision x to fp4 type of the requested kind using specified rounding mode and saturating the out-of-range values.
- __host__ __device__ __nv_fp4_e2m1::__nv_fp4_e2m1()
- 
Default constructor.
- __host__ __device__ __nv_fp4_e2m1::__nv_fp4_e2m1(const unsigned long int val)
- 
Constructor from unsigned long int data type, relies on __NV_SATFINITE behavior for out-of-range values.
- __host__ __device__ __nv_fp4_e2m1::__nv_fp4_e2m1(const double f)
- 
Constructor from double data type, relies on __NV_SATFINITE behavior for out-of-range values and cudaRoundNearest rounding mode.
- __host__ __device__ __nv_fp4_e2m1::__nv_fp4_e2m1(const long int val)
- 
Constructor from long int data type, relies on __NV_SATFINITE behavior for out-of-range values.
- __host__ __device__ __nv_fp4_e2m1::__nv_fp4_e2m1(const float f)
- 
Constructor from float data type, relies on __NV_SATFINITE behavior for out-of-range values and cudaRoundNearest rounding mode.
- __host__ __device__ __nv_fp4_e2m1::__nv_fp4_e2m1(const int val)
- 
Constructor from int data type, relies on __NV_SATFINITE behavior for out-of-range values.
- __host__ __device__ __nv_fp4_e2m1::__nv_fp4_e2m1(const unsigned short int val)
- 
Constructor from unsigned short int data type, relies on __NV_SATFINITE behavior for out-of-range values.
- __host__ __device__ __nv_fp4_e2m1::__nv_fp4_e2m1(const long long int val)
- 
Constructor from long long int data type, relies on __NV_SATFINITE behavior for out-of-range values.
- __host__ __device__ __nv_fp4_e2m1::__nv_fp4_e2m1(const short int val)
- 
Constructor from short int data type.
- __host__ __device__ __nv_fp4_e2m1::__nv_fp4_e2m1(const __nv_bfloat16 f)
- 
Constructor from __nv_bfloat16 data type, relies on __NV_SATFINITE behavior for out-of-range values and cudaRoundNearest rounding mode.
- __host__ __device__ __nv_fp4_e2m1::__nv_fp4_e2m1(const unsigned int val)
- 
Constructor from unsigned int data type, relies on __NV_SATFINITE behavior for out-of-range values.
- __host__ __device__ __nv_fp4_e2m1::__nv_fp4_e2m1(const unsigned long long int val)
- 
Constructor from unsigned long long int data type, relies on __NV_SATFINITE behavior for out-of-range values.
- __host__ __device__ __nv_fp4_e2m1::__nv_fp4_e2m1(const __half f)
- 
Constructor from __half data type, relies on __NV_SATFINITE behavior for out-of-range values and cudaRoundNearest rounding mode.
- __host__ __device__ __nv_fp4x2_e2m1::__nv_fp4x2_e2m1(const double2 f)
- 
Constructor from double2 data type, relies on __NV_SATFINITE behavior for out-of-range values.
- __host__ __device__ __nv_fp4x2_e2m1::__nv_fp4x2_e2m1(const __nv_bfloat162 f)
- 
Constructor from __nv_bfloat162 data type, relies on __NV_SATFINITE behavior for out-of-range values.
- __host__ __device__ __nv_fp4x2_e2m1::__nv_fp4x2_e2m1(const __half2 f)
- 
Constructor from __half2 data type, relies on __NV_SATFINITE behavior for out-of-range values.
- __host__ __device__ __nv_fp4x2_e2m1::__nv_fp4x2_e2m1(const float2 f)
- 
Constructor from float2 data type, relies on __NV_SATFINITE behavior for out-of-range values.
- __host__ __device__ __nv_fp4x2_e2m1::__nv_fp4x2_e2m1()
- 
Default constructor.
- __host__ __device__ __nv_fp4x4_e2m1::__nv_fp4x4_e2m1()
- 
Default constructor.
- __host__ __device__ __nv_fp4x4_e2m1::__nv_fp4x4_e2m1(const __nv_bfloat162 flo, const __nv_bfloat162 fhi)
- 
Constructor from a pair of __nv_bfloat162 data type values, relies on __NV_SATFINITE behavior for out-of-range values.
- __host__ __device__ __nv_fp4x4_e2m1::__nv_fp4x4_e2m1(const double4 f)
- 
Constructor from double4 vector data type, relies on __NV_SATFINITE behavior for out-of-range values.
- __host__ __device__ __nv_fp4x4_e2m1::__nv_fp4x4_e2m1(const float4 f)
- 
Constructor from float4 vector data type, relies on __NV_SATFINITE behavior for out-of-range values.
- __host__ __device__ __nv_fp4x4_e2m1::__nv_fp4x4_e2m1(const __half2 flo, const __half2 fhi)
- 
Constructor from a pair of __half2 data type values, relies on __NV_SATFINITE behavior for out-of-range values.
Typedefs
- __nv_fp4_storage_t
- 
8-bit unsigned integer type abstraction used for fp4 floating-point numbers storage.
- __nv_fp4x2_storage_t
- 
8-bit unsigned integer type abstraction used for storage of pairs of fp4 floating-point numbers.
- __nv_fp4x4_storage_t
- 
16-bit unsigned integer type abstraction used for storage of tetrads of fp4 floating-point numbers.
1.4.1. Enumerations
1.4.2. Functions
- 
__host__ __device__ __nv_fp4x2_storage_t __nv_cvt_bfloat16raw2_to_fp4x2(const __nv_bfloat162_raw x, const __nv_fp4_interpretation_t fp4_interpretation, const enum cudaRoundMode rounding)
 
- 
Converts input vector of two nv_bfloat16 precision numbers packed in __nv_bfloat162_raw x into a vector of two values of fp4 type of the requested kind using specified rounding mode and saturating the out-of-range values.
Converts input vector x to a vector of two fp4 values of the kind specified by fp4_interpretation parameter, using rounding mode specified by rounding parameter. Large out-of-range values saturate to MAXNORM of the same sign. NaN input values result in positive MAXNORM.
- Returns
- 
The __nv_fp4x2_storage_t value holds the result of conversion.
 
 
- 
__host__ __device__ __nv_fp4_storage_t __nv_cvt_bfloat16raw_to_fp4(const __nv_bfloat16_raw x, const __nv_fp4_interpretation_t fp4_interpretation, const enum cudaRoundMode rounding)
 
- 
Converts input nv_bfloat16 precision x to fp4 type of the requested kind using specified rounding mode and saturating the out-of-range values.
Converts input x to fp4 type of the kind specified by fp4_interpretation parameter, using rounding mode specified by rounding parameter. Large out-of-range values saturate to MAXNORM of the same sign. NaN input values result in positive MAXNORM.
- Returns
- 
The __nv_fp4_storage_t value holds the result of conversion.
 
 
- 
__host__ __device__ __nv_fp4x2_storage_t __nv_cvt_double2_to_fp4x2(const double2 x, const __nv_fp4_interpretation_t fp4_interpretation, const enum cudaRoundMode rounding)
 
- 
Converts input vector of two double precision numbers packed in double2 x into a vector of two values of fp4 type of the requested kind using specified rounding mode and saturating the out-of-range values.
Converts input vector x to a vector of two fp4 values of the kind specified by fp4_interpretation parameter, using rounding mode specified by rounding parameter. Large out-of-range values saturate to MAXNORM of the same sign. NaN input values result in positive MAXNORM.
- Returns
- 
The __nv_fp4x2_storage_t value holds the result of conversion.
 
 
- 
__host__ __device__ __nv_fp4_storage_t __nv_cvt_double_to_fp4(const double x, const __nv_fp4_interpretation_t fp4_interpretation, const enum cudaRoundMode rounding)
 
- 
Converts input double precision x to fp4 type of the requested kind using specified rounding mode and saturating the out-of-range values.
Converts input x to fp4 type of the kind specified by fp4_interpretation parameter, using rounding mode specified by rounding parameter. Large out-of-range values saturate to MAXNORM of the same sign. NaN input values result in positive MAXNORM.
- Returns
- 
The __nv_fp4_storage_t value holds the result of conversion.
 
 
- 
__host__ __device__ __nv_fp4x2_storage_t __nv_cvt_float2_to_fp4x2(const float2 x, const __nv_fp4_interpretation_t fp4_interpretation, const enum cudaRoundMode rounding)
 
- 
Converts input vector of two single precision numbers packed in float2 x into a vector of two values of fp4 type of the requested kind using specified rounding mode and saturating the out-of-range values.
Converts input vector x to a vector of two fp4 values of the kind specified by fp4_interpretation parameter, using rounding mode specified by rounding parameter. Large out-of-range values saturate to MAXNORM of the same sign. NaN input values result in positive MAXNORM.
- Returns
- 
The __nv_fp4x2_storage_t value holds the result of conversion.
 
 
- 
__host__ __device__ __nv_fp4_storage_t __nv_cvt_float_to_fp4(const float x, const __nv_fp4_interpretation_t fp4_interpretation, const enum cudaRoundMode rounding)
 
- 
Converts input single precision x to fp4 type of the requested kind using specified rounding mode and saturating the out-of-range values.
Converts input x to fp4 type of the kind specified by fp4_interpretation parameter, using rounding mode specified by rounding parameter. Large out-of-range values saturate to MAXNORM of the same sign. NaN input values result in positive MAXNORM.
- Returns
- 
The __nv_fp4_storage_t value holds the result of conversion.
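
The saturation and NaN rules described for the to-fp4 conversions can be illustrated with a small host-side model. This is a minimal sketch, not the cuda_fp4.h implementation: it assumes the e2m1 kind (1 sign bit, 2 exponent bits, 1 mantissa bit, so MAXNORM is 6.0), it models only round-to-nearest-even, and the name encode_e2m1_nearest is hypothetical.

```cpp
#include <cmath>
#include <cstdint>

// Hypothetical host-side model of float -> fp4 (e2m1) conversion.
// Assumption: e2m1 layout with bit 3 = sign, bits 2..1 = exponent, bit 0 = mantissa.
// Magnitudes representable by the low three bits, indexed by their code.
static const float kE2M1Magnitudes[8] = {0.0f, 0.5f, 1.0f, 1.5f,
                                         2.0f, 3.0f, 4.0f, 6.0f};

std::uint8_t encode_e2m1_nearest(float x) {
    if (std::isnan(x)) {
        return 0x7u;  // NaN input results in positive MAXNORM (+6.0)
    }
    const std::uint8_t sign = std::signbit(x) ? 0x8u : 0x0u;
    const float mag = std::fabs(x);
    if (mag >= 6.0f) {
        // Large out-of-range values (and infinities) saturate to MAXNORM of the same sign.
        return static_cast<std::uint8_t>(sign | 0x7u);
    }
    // Pick the nearest representable magnitude; an exact tie goes to the even code.
    std::uint8_t best = 0u;
    float bestDist = mag;  // distance to code 0 (magnitude 0.0)
    for (std::uint8_t code = 1u; code < 8u; ++code) {
        const float dist = std::fabs(mag - kE2M1Magnitudes[code]);
        if (dist < bestDist || (dist == bestDist && (code % 2u) == 0u)) {
            best = code;
            bestDist = dist;
        }
    }
    return static_cast<std::uint8_t>(sign | best);
}
```

In this model, 100.0f and -100.0f map to 0x7 and 0xF (positive and negative MAXNORM), and a NaN input maps to 0x7. Other rounding modes accepted by the rounding parameter are not modeled here.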
 
 
- 
__host__ __device__ __half_raw __nv_cvt_fp4_to_halfraw(const __nv_fp4_storage_t x, const __nv_fp4_interpretation_t fp4_interpretation)
 
- 
Converts input fp4 x of the specified kind to half precision.
Converts input x of fp4 type of the kind specified by fp4_interpretation parameter to half precision.
- Returns
- 
The __half_raw value holds the result of conversion.
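
The reverse direction can be sketched as a plain bit decode. The following is a hypothetical host-side model, not the library code: it assumes the e2m1 kind with an exponent bias of 1, returns float instead of __half_raw to stay self-contained, and ignores the upper four bits of the storage byte (an assumption about unused bits). The name decode_e2m1 is illustrative only.

```cpp
#include <cstdint>

// Hypothetical host-side model of fp4 (e2m1) -> floating-point decoding.
// Assumption: bit 3 = sign, bits 2..1 = exponent (bias 1), bit 0 = mantissa.
float decode_e2m1(std::uint8_t storage) {
    const unsigned bits = storage & 0xFu;          // keep only the low nibble
    const unsigned sign = bits >> 3;
    const unsigned exponent = (bits >> 1) & 0x3u;
    const unsigned mantissa = bits & 0x1u;

    float magnitude;
    if (exponent == 0u) {
        // Subnormal range: the only values are 0.0 and 0.5.
        magnitude = 0.5f * static_cast<float>(mantissa);
    } else {
        // Normal range: (1 + mantissa / 2) * 2^(exponent - 1).
        magnitude = (1.0f + 0.5f * static_cast<float>(mantissa)) *
                    static_cast<float>(1u << (exponent - 1u));
    }
    return sign ? -magnitude : magnitude;
}
```

Every magnitude produced here (0, 0.5, 1, 1.5, 2, 3, 4, 6) is exactly representable in half precision, so a conversion from e2m1 to half needs no rounding mode, which is consistent with the two-parameter signature above.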
 
 
- 
__host__ __device__ __half2_raw __nv_cvt_fp4x2_to_halfraw2(const __nv_fp4x2_storage_t x, const __nv_fp4_interpretation_t fp4_interpretation)
 
- 
Converts input vector of two fp4 values of the specified kind to a vector of two half precision values packed in __half2_raw structure.
Converts input vector x of fp4 type of the kind specified by fp4_interpretation parameter to a vector of two half precision values and returns as __half2_raw structure.
- Returns
- 
The __half2_raw value holds the result of conversion.
 
 
- 
__host__ __device__ __nv_fp4x2_storage_t __nv_cvt_halfraw2_to_fp4x2(const __half2_raw x, const __nv_fp4_interpretation_t fp4_interpretation, const enum cudaRoundMode rounding)
 
- 
Converts input vector of two half precision numbers packed in __half2_raw x into a vector of two values of fp4 type of the requested kind using specified rounding mode and saturating the out-of-range values.
Converts input vector x to a vector of two fp4 values of the kind specified by fp4_interpretation parameter, using rounding mode specified by rounding parameter. Large out-of-range values saturate to MAXNORM of the same sign. NaN input values result in positive MAXNORM.
- Returns
- 
The __nv_fp4x2_storage_t value holds the result of conversion.
 
 
- 
__host__ __device__ __nv_fp4_storage_t __nv_cvt_halfraw_to_fp4(const __half_raw x, const __nv_fp4_interpretation_t fp4_interpretation, const enum cudaRoundMode rounding)
 
- 
Converts input half precision x to fp4 type of the requested kind using specified rounding mode and saturating the out-of-range values.
Converts input x to fp4 type of the kind specified by fp4_interpretation parameter, using rounding mode specified by rounding parameter. Large out-of-range values saturate to MAXNORM of the same sign. NaN input values result in positive MAXNORM.
- Returns
- 
The __nv_fp4_storage_t value holds the result of conversion.
 
 
1.4.3. Typedefs
- 
typedef __nv_fp8_storage_t __nv_fp4_storage_t
 
- 
8-bit unsigned integer type abstraction used for fp4 floating-point numbers storage.
- 
typedef __nv_fp8_storage_t __nv_fp4x2_storage_t
 
- 
8-bit unsigned integer type abstraction used for storage of pairs of fp4 floating-point numbers.
- 
typedef __nv_fp8x2_storage_t __nv_fp4x4_storage_t
 
- 
16-bit unsigned integer type abstraction used for storage of tetrads of fp4 floating-point numbers.
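
Because two 4-bit values share one 8-bit storage type and four share a 16-bit one, packing comes down to nibble shifts. The sketch below shows one plausible layout using plain integer types. The element order (first element in the low nibble, first pair in the low byte) is an assumption that this section does not state, and the helper names pack_fp4x2, fp4x2_low, fp4x2_high, and pack_fp4x4 are hypothetical.

```cpp
#include <cstdint>

// Hypothetical packing helpers; plain integers stand in for the storage typedefs.
// Assumption: the first element occupies the low nibble, the second the high nibble.
std::uint8_t pack_fp4x2(std::uint8_t first, std::uint8_t second) {
    return static_cast<std::uint8_t>(((second & 0xFu) << 4) | (first & 0xFu));
}

// Extract the first (low-nibble) element of a packed pair.
std::uint8_t fp4x2_low(std::uint8_t pair) {
    return static_cast<std::uint8_t>(pair & 0xFu);
}

// Extract the second (high-nibble) element of a packed pair.
std::uint8_t fp4x2_high(std::uint8_t pair) {
    return static_cast<std::uint8_t>((pair >> 4) & 0xFu);
}

// Assumption: the first pair occupies the low byte of the 16-bit tetrad storage.
std::uint16_t pack_fp4x4(std::uint8_t pairLow, std::uint8_t pairHigh) {
    return static_cast<std::uint16_t>((static_cast<unsigned>(pairHigh) << 8) |
                                      static_cast<unsigned>(pairLow));
}
```

Under these assumptions, packing the codes 0x2 (1.0) and 0x7 (6.0) yields the byte 0x72. Code that depends on the actual element order should verify it against the header rather than rely on this sketch.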