1.2.1. Bfloat16 Arithmetic Functions
[Bfloat16 Precision Intrinsics]
To use these functions, include the header file cuda_bf16.h in your program.
Functions
- __device__  __nv_bfloat162 __h2div ( const __nv_bfloat162 a, const __nv_bfloat162 b )
 - Performs nv_bfloat162 vector division in round-to-nearest-even mode.
 - __device__  __nv_bfloat16 __habs ( const __nv_bfloat16 a )
 - Calculates the absolute value of input nv_bfloat16 number and returns the result.
 - __device__  __nv_bfloat16 __hadd ( const __nv_bfloat16 a, const __nv_bfloat16 b )
 - Performs nv_bfloat16 addition in round-to-nearest-even mode.
 - __device__  __nv_bfloat16 __hadd_sat ( const __nv_bfloat16 a, const __nv_bfloat16 b )
 - Performs nv_bfloat16 addition in round-to-nearest-even mode, with saturation to [0.0, 1.0].
 - __device__  __nv_bfloat16 __hdiv ( const __nv_bfloat16 a, const __nv_bfloat16 b )
 - Performs nv_bfloat16 division in round-to-nearest-even mode.
 - __device__  __nv_bfloat16 __hfma ( const __nv_bfloat16 a, const __nv_bfloat16 b, const __nv_bfloat16 c )
 - Performs nv_bfloat16 fused multiply-add in round-to-nearest-even mode.
 - __device__  __nv_bfloat16 __hfma_relu ( const __nv_bfloat16 a, const __nv_bfloat16 b, const __nv_bfloat16 c )
 - Performs nv_bfloat16 fused multiply-add in round-to-nearest-even mode with relu saturation.
 - __device__  __nv_bfloat16 __hfma_sat ( const __nv_bfloat16 a, const __nv_bfloat16 b, const __nv_bfloat16 c )
 - Performs nv_bfloat16 fused multiply-add in round-to-nearest-even mode, with saturation to [0.0, 1.0].
 - __device__  __nv_bfloat16 __hmul ( const __nv_bfloat16 a, const __nv_bfloat16 b )
 - Performs nv_bfloat16 multiplication in round-to-nearest-even mode.
 - __device__  __nv_bfloat16 __hmul_sat ( const __nv_bfloat16 a, const __nv_bfloat16 b )
 - Performs nv_bfloat16 multiplication in round-to-nearest-even mode, with saturation to [0.0, 1.0].
 - __device__  __nv_bfloat16 __hneg ( const __nv_bfloat16 a )
 - Negates input nv_bfloat16 number and returns the result.
 - __device__  __nv_bfloat16 __hsub ( const __nv_bfloat16 a, const __nv_bfloat16 b )
 - Performs nv_bfloat16 subtraction in round-to-nearest-even mode.
 - __device__  __nv_bfloat16 __hsub_sat ( const __nv_bfloat16 a, const __nv_bfloat16 b )
 - Performs nv_bfloat16 subtraction in round-to-nearest-even mode, with saturation to [0.0, 1.0].
 
Functions
- __device__  __nv_bfloat162 __h2div ( const __nv_bfloat162 a, const __nv_bfloat162 b )
 - Performs nv_bfloat162 vector division in round-to-nearest-even mode.
Description
Divides nv_bfloat162 input vector a by input vector b, element by element, in round-to-nearest-even mode.
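The element-wise division described above can be sketched as follows. This is a minimal illustration, not a definitive implementation: the kernel name h2div_kernel and the host helper h2div_on_device are hypothetical, and the sketch assumes a toolkit that ships cuda_bf16.h and a GPU that supports bfloat16 arithmetic (for example, compiling with nvcc -arch=sm_80).

```cuda
#include <cuda_bf16.h>
#include <cuda_runtime.h>
#include <cassert>

// Kernel: pack two floats per operand into __nv_bfloat162, divide both
// elements with __h2div, and widen the quotients back to float.
__global__ void h2div_kernel(float2 a, float2 b, float2 *out)
{
    const __nv_bfloat162 va = __floats2bfloat162_rn(a.x, a.y);
    const __nv_bfloat162 vb = __floats2bfloat162_rn(b.x, b.y);
    const __nv_bfloat162 q  = __h2div(va, vb);
    *out = make_float2(__low2float(q), __high2float(q));
}

// Hypothetical host helper (not part of cuda_bf16.h): runs the kernel on a
// single thread and copies the result back to the host.
float2 h2div_on_device(float2 a, float2 b)
{
    float2 *d_out = nullptr;
    float2 h_out = make_float2(0.0f, 0.0f);
    cudaError_t err = cudaMalloc(&d_out, sizeof(float2));
    assert(err == cudaSuccess);
    h2div_kernel<<<1, 1>>>(a, b, d_out);
    // cudaMemcpy on the default stream waits for the kernel to finish.
    err = cudaMemcpy(&h_out, d_out, sizeof(float2), cudaMemcpyDeviceToHost);
    assert(err == cudaSuccess);
    (void)err;
    cudaFree(d_out);
    return h_out;
}
```

Under these assumptions, h2div_on_device(make_float2(1.0f, 6.0f), make_float2(4.0f, 3.0f)) should yield 0.25 in the low element and 2.0 in the high element, since both quotients are exactly representable in bfloat16.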
 - __device__  __nv_bfloat16 __habs ( const __nv_bfloat16 a )
 - Calculates the absolute value of input nv_bfloat16 number and returns the result.
Parameters
- a
 - nv_bfloat16. Is only being read.

Returns
nv_bfloat16
- The absolute value of a.
 
Description
Calculates the absolute value of input nv_bfloat16 number and returns the result.
 - __device__  __nv_bfloat16 __hadd ( const __nv_bfloat16 a, const __nv_bfloat16 b )
 - Performs nv_bfloat16 addition in round-to-nearest-even mode.
Description
Performs nv_bfloat16 addition of inputs a and b, in round-to-nearest-even mode.
 - __device__  __nv_bfloat16 __hadd_sat ( const __nv_bfloat16 a, const __nv_bfloat16 b )
 - Performs nv_bfloat16 addition in round-to-nearest-even mode, with saturation to [0.0, 1.0].
Parameters
- a
 - nv_bfloat16. Is only being read.
- b
 - nv_bfloat16. Is only being read.

Returns
nv_bfloat16
- The sum of a and b, with respect to saturation.
 
Description
Performs nv_bfloat16 add of inputs a and b, in round-to-nearest-even mode, and clamps the result to range [0.0, 1.0]. NaN results are flushed to +0.0.
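To make the clamping behaviour concrete, the following sketch adds two values with __hadd_sat and returns the result as a float. The names hadd_sat_kernel and hadd_sat_on_device are hypothetical helpers introduced only for this illustration, and the code assumes a GPU and toolkit with bfloat16 arithmetic support (for example, nvcc -arch=sm_80).

```cuda
#include <cuda_bf16.h>
#include <cuda_runtime.h>
#include <cassert>

// Kernel: convert both inputs to bfloat16, add with saturation to
// [0.0, 1.0], and widen the result back to float.
__global__ void hadd_sat_kernel(float a, float b, float *out)
{
    const __nv_bfloat16 s = __hadd_sat(__float2bfloat16(a), __float2bfloat16(b));
    *out = __bfloat162float(s);
}

// Hypothetical host helper (not part of cuda_bf16.h): runs the kernel on a
// single thread and copies the result back to the host.
float hadd_sat_on_device(float a, float b)
{
    float *d_out = nullptr;
    float h_out = 0.0f;
    cudaError_t err = cudaMalloc(&d_out, sizeof(float));
    assert(err == cudaSuccess);
    hadd_sat_kernel<<<1, 1>>>(a, b, d_out);
    // cudaMemcpy on the default stream waits for the kernel to finish.
    err = cudaMemcpy(&h_out, d_out, sizeof(float), cudaMemcpyDeviceToHost);
    assert(err == cudaSuccess);
    (void)err;
    cudaFree(d_out);
    return h_out;
}
```

With these inputs, which are all exactly representable in bfloat16, 0.75 + 0.5 should saturate to 1.0, -0.5 + 0.25 should saturate to 0.0, and 0.25 + 0.5 should pass through unchanged as 0.75.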
 - __device__  __nv_bfloat16 __hdiv ( const __nv_bfloat16 a, const __nv_bfloat16 b )
 - Performs nv_bfloat16 division in round-to-nearest-even mode.
Description
Divides nv_bfloat16 input a by input b in round-to-nearest-even mode.
 - __device__  __nv_bfloat16 __hfma ( const __nv_bfloat16 a, const __nv_bfloat16 b, const __nv_bfloat16 c )
 - Performs nv_bfloat16 fused multiply-add in round-to-nearest-even mode.
Description
Performs nv_bfloat16 multiply on inputs a and b, then performs an nv_bfloat16 add of the result with c, rounding the result once in round-to-nearest-even mode.
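The effect of rounding only once can be illustrated by comparing __hfma with a reference that rounds the product before adding. The sketch below is an assumption-laden illustration: hfma_kernel, hfma_on_device, and two_step_reference are hypothetical names, the reference runs on the host using the float conversions from cuda_bf16.h, and the device code assumes bfloat16 arithmetic support (for example, nvcc -arch=sm_80).

```cuda
#include <cuda_bf16.h>
#include <cuda_runtime.h>
#include <cassert>

// Kernel: compute a * b + c with a single rounding via __hfma.
__global__ void hfma_kernel(float a, float b, float c, float *out)
{
    const __nv_bfloat16 r = __hfma(__float2bfloat16(a),
                                   __float2bfloat16(b),
                                   __float2bfloat16(c));
    *out = __bfloat162float(r);
}

// Hypothetical host helper (not part of cuda_bf16.h): runs the kernel on a
// single thread and copies the result back to the host.
float hfma_on_device(float a, float b, float c)
{
    float *d_out = nullptr;
    float h_out = 0.0f;
    cudaError_t err = cudaMalloc(&d_out, sizeof(float));
    assert(err == cudaSuccess);
    hfma_kernel<<<1, 1>>>(a, b, c, d_out);
    // cudaMemcpy on the default stream waits for the kernel to finish.
    err = cudaMemcpy(&h_out, d_out, sizeof(float), cudaMemcpyDeviceToHost);
    assert(err == cudaSuccess);
    (void)err;
    cudaFree(d_out);
    return h_out;
}

// Hypothetical host-side reference that rounds twice: the product is rounded
// to bfloat16 first, then the sum is rounded to bfloat16 again.
float two_step_reference(float a, float b, float c)
{
    const float fa = __bfloat162float(__float2bfloat16(a));
    const float fb = __bfloat162float(__float2bfloat16(b));
    const float fc = __bfloat162float(__float2bfloat16(c));
    const float prod = __bfloat162float(__float2bfloat16(fa * fb)); // first rounding
    return __bfloat162float(__float2bfloat16(prod + fc));           // second rounding
}
```

For a = b = 1.0078125 (1 + 2^-7) and c = -1.015625 (-(1 + 2^-6)), the exact value of a * b + c is 2^-14. The fused operation keeps that value, whereas the two-step reference loses the 2^-14 term when the product is rounded and therefore returns 0.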
 - __device__  __nv_bfloat16 __hfma_relu ( const __nv_bfloat16 a, const __nv_bfloat16 b, const __nv_bfloat16 c )
 - Performs nv_bfloat16 fused multiply-add in round-to-nearest-even mode with relu saturation.
Parameters
- a
 - nv_bfloat16. Is only being read.
- b
 - nv_bfloat16. Is only being read.
- c
 - nv_bfloat16. Is only being read.

Returns
nv_bfloat16
- The result of fused multiply-add operation on a, b, and c with relu saturation.
 
Description
Performs nv_bfloat16 multiply on inputs a and b, then performs an nv_bfloat16 add of the result with c, rounding the result once in round-to-nearest-even mode. A negative result is then clamped to 0, and a NaN result is converted to canonical NaN.
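As a sketch of the relu behaviour described above, the following example evaluates __hfma_relu for one negative and one positive result. The names hfma_relu_kernel and hfma_relu_on_device are hypothetical, and the code assumes a GPU and toolkit on which __hfma_relu is available for bfloat16 (for example, nvcc -arch=sm_80).

```cuda
#include <cuda_bf16.h>
#include <cuda_runtime.h>
#include <cassert>

// Kernel: compute max(a * b + c, 0) with a single rounding via __hfma_relu.
__global__ void hfma_relu_kernel(float a, float b, float c, float *out)
{
    const __nv_bfloat16 r = __hfma_relu(__float2bfloat16(a),
                                        __float2bfloat16(b),
                                        __float2bfloat16(c));
    *out = __bfloat162float(r);
}

// Hypothetical host helper (not part of cuda_bf16.h): runs the kernel on a
// single thread and copies the result back to the host.
float hfma_relu_on_device(float a, float b, float c)
{
    float *d_out = nullptr;
    float h_out = 0.0f;
    cudaError_t err = cudaMalloc(&d_out, sizeof(float));
    assert(err == cudaSuccess);
    hfma_relu_kernel<<<1, 1>>>(a, b, c, d_out);
    // cudaMemcpy on the default stream waits for the kernel to finish.
    err = cudaMemcpy(&h_out, d_out, sizeof(float), cudaMemcpyDeviceToHost);
    assert(err == cudaSuccess);
    (void)err;
    cudaFree(d_out);
    return h_out;
}
```

Here 2 * (-3) + 1 = -5 is negative and should be clamped to 0, while 2 * 3 + 1 = 7 is positive and should be returned unchanged. Unlike the _sat variants, there is no upper bound of 1.0.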
 - __device__  __nv_bfloat16 __hfma_sat ( const __nv_bfloat16 a, const __nv_bfloat16 b, const __nv_bfloat16 c )
 - Performs nv_bfloat16 fused multiply-add in round-to-nearest-even mode, with saturation to [0.0, 1.0].
Parameters
- a
 - nv_bfloat16. Is only being read.
- b
 - nv_bfloat16. Is only being read.
- c
 - nv_bfloat16. Is only being read.

Returns
nv_bfloat16
- The result of fused multiply-add operation on a, b, and c, with respect to saturation.
 
Description
Performs nv_bfloat16 multiply on inputs a and b, then performs an nv_bfloat16 add of the result with c, rounding the result once in round-to-nearest-even mode, and clamps the result to range [0.0, 1.0]. NaN results are flushed to +0.0.
 - __device__  __nv_bfloat16 __hmul ( const __nv_bfloat16 a, const __nv_bfloat16 b )
 - Performs nv_bfloat16 multiplication in round-to-nearest-even mode.
Description
Performs nv_bfloat16 multiplication of inputs a and b, in round-to-nearest-even mode.
 - __device__  __nv_bfloat16 __hmul_sat ( const __nv_bfloat16 a, const __nv_bfloat16 b )
 - Performs nv_bfloat16 multiplication in round-to-nearest-even mode, with saturation to [0.0, 1.0].
Parameters
- a
 - nv_bfloat16. Is only being read.
- b
 - nv_bfloat16. Is only being read.

Returns
nv_bfloat16
- The result of multiplying a and b, with respect to saturation.
 
Description
Performs nv_bfloat16 multiplication of inputs a and b, in round-to-nearest-even mode, and clamps the result to range [0.0, 1.0]. NaN results are flushed to +0.0.
 - __device__  __nv_bfloat16 __hneg ( const __nv_bfloat16 a )
 - Negates input nv_bfloat16 number and returns the result.
Description
Negates input nv_bfloat16 number and returns the result.
 - __device__  __nv_bfloat16 __hsub ( const __nv_bfloat16 a, const __nv_bfloat16 b )
 - Performs nv_bfloat16 subtraction in round-to-nearest-even mode.
Description
Subtracts nv_bfloat16 input b from input a in round-to-nearest-even mode.
 - __device__  __nv_bfloat16 __hsub_sat ( const __nv_bfloat16 a, const __nv_bfloat16 b )
 - Performs nv_bfloat16 subtraction in round-to-nearest-even mode, with saturation to [0.0, 1.0].
Parameters
- a
 - nv_bfloat16. Is only being read.
- b
 - nv_bfloat16. Is only being read.

Returns
nv_bfloat16
- The result of subtraction of b from a, with respect to saturation.
 
Description
Subtracts nv_bfloat16 input b from input a in round-to-nearest-even mode, and clamps the result to range [0.0, 1.0]. NaN results are flushed to +0.0.