1.2.2. Bfloat162 Arithmetic Functions

[Bfloat16 Precision Intrinsics]

To use these functions, include the header file cuda_bf16.h in your program.

Functions

__device__ __nv_bfloat162 __habs2 ( const __nv_bfloat162 a )
Calculates the absolute value of both halves of the input nv_bfloat162 number and returns the result.
__device__ __nv_bfloat162 __hadd2 ( const __nv_bfloat162 a, const __nv_bfloat162 b )
Performs nv_bfloat162 vector addition in round-to-nearest-even mode.
__device__ __nv_bfloat162 __hadd2_sat ( const __nv_bfloat162 a, const __nv_bfloat162 b )
Performs nv_bfloat162 vector addition in round-to-nearest-even mode, with saturation to [0.0, 1.0].
__device__ __nv_bfloat162 __hfma2 ( const __nv_bfloat162 a, const __nv_bfloat162 b, const __nv_bfloat162 c )
Performs nv_bfloat162 vector fused multiply-add in round-to-nearest-even mode.
__device__ __nv_bfloat162 __hfma2_relu ( const __nv_bfloat162 a, const __nv_bfloat162 b, const __nv_bfloat162 c )
Performs nv_bfloat162 vector fused multiply-add in round-to-nearest-even mode with relu saturation.
__device__ __nv_bfloat162 __hfma2_sat ( const __nv_bfloat162 a, const __nv_bfloat162 b, const __nv_bfloat162 c )
Performs nv_bfloat162 vector fused multiply-add in round-to-nearest-even mode, with saturation to [0.0, 1.0].
__device__ __nv_bfloat162 __hmul2 ( const __nv_bfloat162 a, const __nv_bfloat162 b )
Performs nv_bfloat162 vector multiplication in round-to-nearest-even mode.
__device__ __nv_bfloat162 __hmul2_sat ( const __nv_bfloat162 a, const __nv_bfloat162 b )
Performs nv_bfloat162 vector multiplication in round-to-nearest-even mode, with saturation to [0.0, 1.0].
__device__ __nv_bfloat162 __hneg2 ( const __nv_bfloat162 a )
Negates both halves of the input nv_bfloat162 number and returns the result.
__device__ __nv_bfloat162 __hsub2 ( const __nv_bfloat162 a, const __nv_bfloat162 b )
Performs nv_bfloat162 vector subtraction in round-to-nearest-even mode.
__device__ __nv_bfloat162 __hsub2_sat ( const __nv_bfloat162 a, const __nv_bfloat162 b )
Performs nv_bfloat162 vector subtraction in round-to-nearest-even mode, with saturation to [0.0, 1.0].

Functions

__device__ __nv_bfloat162 __habs2 ( const __nv_bfloat162 a )
Calculates the absolute value of both halves of the input nv_bfloat162 number and returns the result.
Parameters
a
- nv_bfloat162. Is only being read.
Returns

nv_bfloat162

  • Returns a with the absolute value of both halves.

Description

Calculates the absolute value of both halves of the input nv_bfloat162 number a and returns the result.
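
As a minimal sketch of how __habs2 might be called, the kernel below converts a float2 to nv_bfloat162, applies __habs2, and converts the result back. The names abs_kernel and abs_pair are hypothetical and chosen only for illustration, and managed memory is just one convenient way to read the result on the host. The sketch assumes a device and toolkit on which the bfloat16 arithmetic intrinsics are available (for example, compute capability 8.0 or higher).

```cuda
#include <cuda_bf16.h>
#include <cuda_runtime.h>

// Converts the input pair to nv_bfloat162, takes the absolute value of
// both halves, and stores the result as float2.
__global__ void abs_kernel(float2 x, float2* out)
{
    __nv_bfloat162 v = __float22bfloat162_rn(x);
    *out = __bfloat1622float2(__habs2(v));
}

// Hypothetical host helper: launches the kernel on a single pair.
float2 abs_pair(float2 x)
{
    float2* out = nullptr;
    cudaMallocManaged(&out, sizeof(float2));
    abs_kernel<<<1, 1>>>(x, out);
    cudaDeviceSynchronize();
    float2 r = *out;
    cudaFree(out);
    return r;
}
```

For example, abs_pair(make_float2(-1.5f, 2.0f)) yields the pair (1.5, 2.0), since both values are exactly representable in bfloat16.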

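__device__ __nv_bfloat162 __hadd2 ( const __nv_bfloat162 a, const __nv_bfloat162 b )
Performs nv_bfloat162 vector addition in round-to-nearest-even mode.
Parameters
a
- nv_bfloat162. Is only being read.
b
- nv_bfloat162. Is only being read.
Returns

nv_bfloat162

  • The sum of vectors a and b.

Description

Performs nv_bfloat162 vector add of inputs a and b, in round-to-nearest-even mode.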
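
The round-to-nearest-even behavior can be illustrated with values near 256, where adjacent bfloat16 values are 2 apart (bfloat16 carries 8 significant bits). The sum 256 + 1 = 257 lies exactly halfway between 256 and 258 and rounds to 256, whose last significand bit is even; 258 + 1 = 259 lies halfway between 258 and 260 and rounds to 260. The sketch below uses the hypothetical names add_kernel and add_pair and assumes a device that supports the bfloat16 arithmetic intrinsics.

```cuda
#include <cuda_bf16.h>
#include <cuda_runtime.h>

// Adds two packed pairs with __hadd2 and stores the result as float2.
__global__ void add_kernel(float2 a, float2 b, float2* out)
{
    __nv_bfloat162 va = __float22bfloat162_rn(a);
    __nv_bfloat162 vb = __float22bfloat162_rn(b);
    *out = __bfloat1622float2(__hadd2(va, vb));
}

// Hypothetical host helper: launches the kernel on a single pair.
float2 add_pair(float2 a, float2 b)
{
    float2* out = nullptr;
    cudaMallocManaged(&out, sizeof(float2));
    add_kernel<<<1, 1>>>(a, b, out);
    cudaDeviceSynchronize();
    float2 r = *out;
    cudaFree(out);
    return r;
}
```

With these definitions, add_pair(make_float2(256.0f, 258.0f), make_float2(1.0f, 1.0f)) returns (256, 260): both ties are resolved toward the neighbor with an even significand.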

__device__ __nv_bfloat162 __hadd2_sat ( const __nv_bfloat162 a, const __nv_bfloat162 b )
Performs nv_bfloat162 vector addition in round-to-nearest-even mode, with saturation to [0.0, 1.0].
Parameters
a
- nv_bfloat162. Is only being read.
b
- nv_bfloat162. Is only being read.
Returns

nv_bfloat162

  • The sum of a and b, with respect to saturation.

Description

Performs nv_bfloat162 vector add of inputs a and b, in round-to-nearest-even mode, and clamps the results to range [0.0, 1.0]. NaN results are flushed to +0.0.
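
The clamping described above can be sketched with one sum that exceeds 1.0 and one that falls below 0.0. The names add_sat_kernel and add_sat_pair are hypothetical, and the sketch assumes a device that supports the bfloat16 arithmetic intrinsics.

```cuda
#include <cuda_bf16.h>
#include <cuda_runtime.h>

// Adds two packed pairs with __hadd2_sat; each half is clamped to [0.0, 1.0].
__global__ void add_sat_kernel(float2 a, float2 b, float2* out)
{
    __nv_bfloat162 va = __float22bfloat162_rn(a);
    __nv_bfloat162 vb = __float22bfloat162_rn(b);
    *out = __bfloat1622float2(__hadd2_sat(va, vb));
}

// Hypothetical host helper: launches the kernel on a single pair.
float2 add_sat_pair(float2 a, float2 b)
{
    float2* out = nullptr;
    cudaMallocManaged(&out, sizeof(float2));
    add_sat_kernel<<<1, 1>>>(a, b, out);
    cudaDeviceSynchronize();
    float2 r = *out;
    cudaFree(out);
    return r;
}
```

For a = (0.75, -0.25) and b = (0.5, 0.125), the unclamped sums would be 1.25 and -0.125, so add_sat_pair returns (1.0, 0.0).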

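__device__ __nv_bfloat162 __hfma2 ( const __nv_bfloat162 a, const __nv_bfloat162 b, const __nv_bfloat162 c )
Performs nv_bfloat162 vector fused multiply-add in round-to-nearest-even mode.
Parameters
a
- nv_bfloat162. Is only being read.
b
- nv_bfloat162. Is only being read.
c
- nv_bfloat162. Is only being read.
Returns

nv_bfloat162

  • The result of elementwise fused multiply-add operation on vectors a, b, and c.

Description

Performs nv_bfloat162 vector multiply on inputs a and b, then performs an nv_bfloat162 vector add of the result with c, rounding the result once in round-to-nearest-even mode.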
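
The effect of rounding only once can be seen with a = b = 1 + 2^-7 and c = -(1 + 2^-6), all exactly representable in bfloat16. The exact product is 1 + 2^-6 + 2^-14, so the fused result is 2^-14. If the product were first rounded to bfloat16, the 2^-14 term would be lost and the sum would be 0. The sketch below uses the hypothetical names fma_kernel and fma_pair and assumes a device that supports the bfloat16 arithmetic intrinsics.

```cuda
#include <cuda_bf16.h>
#include <cuda_runtime.h>

// Computes a * b + c on both halves with a single rounding step.
__global__ void fma_kernel(float2 a, float2 b, float2 c, float2* out)
{
    __nv_bfloat162 va = __float22bfloat162_rn(a);
    __nv_bfloat162 vb = __float22bfloat162_rn(b);
    __nv_bfloat162 vc = __float22bfloat162_rn(c);
    *out = __bfloat1622float2(__hfma2(va, vb, vc));
}

// Hypothetical host helper: launches the kernel on a single triple of pairs.
float2 fma_pair(float2 a, float2 b, float2 c)
{
    float2* out = nullptr;
    cudaMallocManaged(&out, sizeof(float2));
    fma_kernel<<<1, 1>>>(a, b, c, out);
    cudaDeviceSynchronize();
    float2 r = *out;
    cudaFree(out);
    return r;
}
```

Calling fma_pair with a = b = (1.0078125, 1.0078125) and c = (-1.015625, -1.015625) returns 2^-14 (about 6.1035e-05) in both halves.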

__device__ __nv_bfloat162 __hfma2_relu ( const __nv_bfloat162 a, const __nv_bfloat162 b, const __nv_bfloat162 c )
Performs nv_bfloat162 vector fused multiply-add in round-to-nearest-even mode with relu saturation.
Parameters
a
- nv_bfloat162. Is only being read.
b
- nv_bfloat162. Is only being read.
c
- nv_bfloat162. Is only being read.
Returns

nv_bfloat162

  • The result of elementwise fused multiply-add operation on vectors a, b, and c with relu saturation.

Description

Performs nv_bfloat162 vector multiply on inputs a and b, then performs an nv_bfloat162 vector add of the result with c, rounding the result once in round-to-nearest-even mode. Negative results are then clamped to 0, and NaN results are converted to canonical NaN.
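
A minimal sketch of the relu behavior follows: one half produces a positive result that passes through unchanged, the other a negative result that is clamped to 0. The names fma_relu_kernel and fma_relu_pair are hypothetical, and the sketch assumes compilation for a target that provides this intrinsic (for example, compute capability 8.0 or higher).

```cuda
#include <cuda_bf16.h>
#include <cuda_runtime.h>

// Computes max(a * b + c, 0) on both halves using __hfma2_relu.
__global__ void fma_relu_kernel(float2 a, float2 b, float2 c, float2* out)
{
    __nv_bfloat162 va = __float22bfloat162_rn(a);
    __nv_bfloat162 vb = __float22bfloat162_rn(b);
    __nv_bfloat162 vc = __float22bfloat162_rn(c);
    *out = __bfloat1622float2(__hfma2_relu(va, vb, vc));
}

// Hypothetical host helper: launches the kernel on a single triple of pairs.
float2 fma_relu_pair(float2 a, float2 b, float2 c)
{
    float2* out = nullptr;
    cudaMallocManaged(&out, sizeof(float2));
    fma_relu_kernel<<<1, 1>>>(a, b, c, out);
    cudaDeviceSynchronize();
    float2 r = *out;
    cudaFree(out);
    return r;
}
```

For a = (2, 2), b = (3, -3), and c = (1, 1), the unclamped results would be 7 and -5, so fma_relu_pair returns (7, 0).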

__device__ __nv_bfloat162 __hfma2_sat ( const __nv_bfloat162 a, const __nv_bfloat162 b, const __nv_bfloat162 c )
Performs nv_bfloat162 vector fused multiply-add in round-to-nearest-even mode, with saturation to [0.0, 1.0].
Parameters
a
- nv_bfloat162. Is only being read.
b
- nv_bfloat162. Is only being read.
c
- nv_bfloat162. Is only being read.
Returns

nv_bfloat162

  • The result of elementwise fused multiply-add operation on vectors a, b, and c, with respect to saturation.

Description

Performs nv_bfloat162 vector multiply on inputs a and b, then performs an nv_bfloat162 vector add of the result with c, rounding the result once in round-to-nearest-even mode, and clamps the results to range [0.0, 1.0]. NaN results are flushed to +0.0.

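__device__ __nv_bfloat162 __hmul2 ( const __nv_bfloat162 a, const __nv_bfloat162 b )
Performs nv_bfloat162 vector multiplication in round-to-nearest-even mode.
Parameters
a
- nv_bfloat162. Is only being read.
b
- nv_bfloat162. Is only being read.
Returns

nv_bfloat162

  • The result of elementwise multiplication of vectors a and b.

Description

Performs nv_bfloat162 vector multiplication of inputs a and b, in round-to-nearest-even mode.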

__device__ __nv_bfloat162 __hmul2_sat ( const __nv_bfloat162 a, const __nv_bfloat162 b )
Performs nv_bfloat162 vector multiplication in round-to-nearest-even mode, with saturation to [0.0, 1.0].
Parameters
a
- nv_bfloat162. Is only being read.
b
- nv_bfloat162. Is only being read.
Returns

nv_bfloat162

  • The result of elementwise multiplication of vectors a and b, with respect to saturation.

Description

Performs nv_bfloat162 vector multiplication of inputs a and b, in round-to-nearest-even mode, and clamps the results to range [0.0, 1.0]. NaN results are flushed to +0.0.
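
One plausible use of the saturating multiply is scaling normalized values, such as color channels, by a gain while keeping them in [0.0, 1.0]. The sketch below applies __hmul2_sat across an array of packed pairs. The names scale_clamped_kernel and scale_clamped, the float2 staging buffers, and the launch configuration are all illustrative assumptions, not part of the API; a device that supports the bfloat16 arithmetic intrinsics is assumed.

```cuda
#include <cuda_bf16.h>
#include <cuda_runtime.h>

// Multiplies each packed pair by its gain pair and clamps to [0.0, 1.0].
__global__ void scale_clamped_kernel(const float2* color, const float2* gain,
                                     float2* out, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) {
        __nv_bfloat162 c = __float22bfloat162_rn(color[i]);
        __nv_bfloat162 g = __float22bfloat162_rn(gain[i]);
        out[i] = __bfloat1622float2(__hmul2_sat(c, g));
    }
}

// Hypothetical host helper: stages inputs in managed memory, launches the
// kernel over n pairs, and copies the results back into out.
void scale_clamped(const float2* color, const float2* gain, float2* out, int n)
{
    float2 *d_color, *d_gain, *d_out;
    cudaMallocManaged(&d_color, n * sizeof(float2));
    cudaMallocManaged(&d_gain, n * sizeof(float2));
    cudaMallocManaged(&d_out, n * sizeof(float2));
    for (int i = 0; i < n; ++i) {
        d_color[i] = color[i];
        d_gain[i] = gain[i];
    }
    int threads = 128;
    int blocks = (n + threads - 1) / threads;
    scale_clamped_kernel<<<blocks, threads>>>(d_color, d_gain, d_out, n);
    cudaDeviceSynchronize();
    for (int i = 0; i < n; ++i) {
        out[i] = d_out[i];
    }
    cudaFree(d_color);
    cudaFree(d_gain);
    cudaFree(d_out);
}
```

For inputs (0.5, 0.75) and (0.25, -0.5) with gains (1.5, 2.0) and (2.0, 0.5), the products 0.75, 1.5, 0.5, and -0.25 become (0.75, 1.0) and (0.5, 0.0) after clamping.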

__device__ __nv_bfloat162 __hneg2 ( const __nv_bfloat162 a )
Negates both halves of the input nv_bfloat162 number and returns the result.
Parameters
a
- nv_bfloat162. Is only being read.
Returns

nv_bfloat162

  • Returns a with both halves negated.

Description

Negates both halves of the input nv_bfloat162 number a and returns the result.

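__device__ __nv_bfloat162 __hsub2 ( const __nv_bfloat162 a, const __nv_bfloat162 b )
Performs nv_bfloat162 vector subtraction in round-to-nearest-even mode.
Parameters
a
- nv_bfloat162. Is only being read.
b
- nv_bfloat162. Is only being read.
Returns

nv_bfloat162

  • The subtraction of vector b from a.

Description

Subtracts nv_bfloat162 input vector b from input vector a in round-to-nearest-even mode.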
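
The operand order (b is subtracted from a) can be checked against the equivalent form a + (-b) built from __hadd2 and __hneg2. The sketch below computes both for one pair; the names sub_kernel and sub_two_ways are hypothetical, and a device that supports the bfloat16 arithmetic intrinsics is assumed.

```cuda
#include <cuda_bf16.h>
#include <cuda_runtime.h>

// Writes a - b to out[0] and a + (-b) to out[1].
__global__ void sub_kernel(float2 a, float2 b, float2* out)
{
    __nv_bfloat162 va = __float22bfloat162_rn(a);
    __nv_bfloat162 vb = __float22bfloat162_rn(b);
    out[0] = __bfloat1622float2(__hsub2(va, vb));
    out[1] = __bfloat1622float2(__hadd2(va, __hneg2(vb)));
}

// Hypothetical host helper: result[0] receives the __hsub2 output and
// result[1] the __hadd2/__hneg2 output.
void sub_two_ways(float2 a, float2 b, float2 result[2])
{
    float2* out = nullptr;
    cudaMallocManaged(&out, 2 * sizeof(float2));
    sub_kernel<<<1, 1>>>(a, b, out);
    cudaDeviceSynchronize();
    result[0] = out[0];
    result[1] = out[1];
    cudaFree(out);
}
```

For a = (5, 1.5) and b = (2, 4), both forms give (3, -2.5).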

__device__ __nv_bfloat162 __hsub2_sat ( const __nv_bfloat162 a, const __nv_bfloat162 b )
Performs nv_bfloat162 vector subtraction in round-to-nearest-even mode, with saturation to [0.0, 1.0].
Parameters
a
- nv_bfloat162. Is only being read.
b
- nv_bfloat162. Is only being read.
Returns

nv_bfloat162

  • The subtraction of vector b from a, with respect to saturation.

Description

Subtracts nv_bfloat162 input vector b from input vector a in round-to-nearest-even mode, and clamps the results to range [0.0, 1.0]. NaN results are flushed to +0.0.
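
The NaN handling stated above can be sketched by passing a NaN in one half and an in-range difference in the other. The names sub_sat_kernel and sub_sat_pair are hypothetical, and the sketch assumes compilation for a device with native bfloat16 arithmetic support (for example, compute capability 8.0 or higher).

```cuda
#include <cuda_bf16.h>
#include <cuda_runtime.h>

// Computes a - b on both halves with __hsub2_sat: results are clamped to
// [0.0, 1.0], and a NaN result is documented to become +0.0.
__global__ void sub_sat_kernel(float2 a, float2 b, float2* out)
{
    __nv_bfloat162 va = __float22bfloat162_rn(a);
    __nv_bfloat162 vb = __float22bfloat162_rn(b);
    *out = __bfloat1622float2(__hsub2_sat(va, vb));
}

// Hypothetical host helper: launches the kernel on a single pair.
float2 sub_sat_pair(float2 a, float2 b)
{
    float2* out = nullptr;
    cudaMallocManaged(&out, sizeof(float2));
    sub_sat_kernel<<<1, 1>>>(a, b, out);
    cudaDeviceSynchronize();
    float2 r = *out;
    cudaFree(out);
    return r;
}
```

With a = (0.75, NaN) and b = (0.25, 0.5), the first half is the in-range difference 0.5 and the second half, whose unsaturated result would be NaN, is returned as 0.0.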