1.1.1. Half Arithmetic Functions
[Half Precision Intrinsics]
To use these functions, include the header file cuda_fp16.h in your program.
Functions
 __device__ __half __habs ( const __half a )
 Calculates the absolute value of input half number and returns the result.
 __device__ __half __hadd ( const __half a, const __half b )
 Performs half addition in round-to-nearest-even mode.
 __device__ __half __hadd_rn ( const __half a, const __half b )
 Performs half addition in round-to-nearest-even mode.
 __device__ __half __hadd_sat ( const __half a, const __half b )
 Performs half addition in round-to-nearest-even mode, with saturation to [0.0, 1.0].
 __device__ __half __hdiv ( const __half a, const __half b )
 Performs half division in round-to-nearest-even mode.
 __device__ __half __hfma ( const __half a, const __half b, const __half c )
 Performs half fused multiply-add in round-to-nearest-even mode.
 __device__ __half __hfma_relu ( const __half a, const __half b, const __half c )
 Performs half fused multiply-add in round-to-nearest-even mode with relu saturation.
 __device__ __half __hfma_sat ( const __half a, const __half b, const __half c )
 Performs half fused multiply-add in round-to-nearest-even mode, with saturation to [0.0, 1.0].
 __device__ __half __hmul ( const __half a, const __half b )
 Performs half multiplication in round-to-nearest-even mode.
 __device__ __half __hmul_rn ( const __half a, const __half b )
 Performs half multiplication in round-to-nearest-even mode.
 __device__ __half __hmul_sat ( const __half a, const __half b )
 Performs half multiplication in round-to-nearest-even mode, with saturation to [0.0, 1.0].
 __device__ __half __hneg ( const __half a )
 Negates input half number and returns the result.
 __device__ __half __hsub ( const __half a, const __half b )
 Performs half subtraction in round-to-nearest-even mode.
 __device__ __half __hsub_rn ( const __half a, const __half b )
 Performs half subtraction in round-to-nearest-even mode.
 __device__ __half __hsub_sat ( const __half a, const __half b )
 Performs half subtraction in round-to-nearest-even mode, with saturation to [0.0, 1.0].
 __device__ __half atomicAdd ( __half* const address, const __half val )
 Adds val to the value stored at address in global or shared memory, and writes the sum back to address. The read, add, and write are performed as one atomic operation.
Functions
 __device__ __half __habs ( const __half a )

Calculates the absolute value of input half number and returns the result.
Parameters
 a
  half. Is only being read.
Returns
half
 The absolute value of a.
Description
Calculates the absolute value of input half number and returns the result.
 __device__ __half __hadd ( const __half a, const __half b )

Performs half addition in round-to-nearest-even mode.
Description
Performs half addition of inputs a and b, in round-to-nearest-even mode.
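As a rough illustration, the kernel below applies __hadd element-wise. The names add_half and add_half_on_device are hypothetical helpers, not part of cuda_fp16.h. The sketch assumes a device of compute capability 5.3 or higher (for example, built with nvcc -arch=sm_70), assumes managed memory is available, and omits error checking.

```cuda
#include <cuda_fp16.h>
#include <cuda_runtime.h>

// Element-wise half addition: out[i] = a[i] + b[i], rounded to nearest even.
__global__ void add_half(const __half* a, const __half* b, __half* out, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) {
        out[i] = __hadd(a[i], b[i]);
    }
}

// Hypothetical host helper: adds two scalars on the device and returns the
// sum converted back to float. Error checking is omitted for brevity.
float add_half_on_device(float x, float y) {
    __half* buf = nullptr;
    cudaMallocManaged(&buf, 3 * sizeof(__half));  // assumes managed memory support
    buf[0] = __float2half(x);
    buf[1] = __float2half(y);
    add_half<<<1, 1>>>(buf, buf + 1, buf + 2, 1);
    cudaDeviceSynchronize();                      // wait before reading on the host
    float result = __half2float(buf[2]);
    cudaFree(buf);
    return result;
}
```

Calling add_half_on_device(1.5f, 2.25f) should return 3.75f, since both inputs and their sum are exactly representable in half precision.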
 __device__ __half __hadd_rn ( const __half a, const __half b )

Performs half addition in round-to-nearest-even mode.
Description
Performs half addition of inputs a and b, in round-to-nearest-even mode. Prevents floating-point contractions of mul+add into fma.
 __device__ __half __hadd_sat ( const __half a, const __half b )

Performs half addition in round-to-nearest-even mode, with saturation to [0.0, 1.0].
Parameters
 a
  half. Is only being read.
 b
  half. Is only being read.
Returns
half
 The sum of a and b, with respect to saturation.
Description
Performs half addition of inputs a and b, in round-to-nearest-even mode, and clamps the result to the range [0.0, 1.0]. NaN results are flushed to +0.0.
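The clamping behavior can be sketched as follows. The kernel add_sat_kernel and the host helper add_sat_on_device are hypothetical names introduced for illustration. The sketch assumes compute capability 5.3 or higher and managed memory support, and omits error checking.

```cuda
#include <cuda_fp16.h>
#include <cuda_runtime.h>

// Single-thread kernel: saturating half addition, result clamped to [0.0, 1.0].
__global__ void add_sat_kernel(const __half* a, const __half* b, __half* out) {
    *out = __hadd_sat(*a, *b);
}

// Hypothetical host helper: runs one saturating addition on the device and
// returns the result as float. Error checking is omitted for brevity.
float add_sat_on_device(float x, float y) {
    __half* buf = nullptr;
    cudaMallocManaged(&buf, 3 * sizeof(__half));  // assumes managed memory support
    buf[0] = __float2half(x);
    buf[1] = __float2half(y);
    add_sat_kernel<<<1, 1>>>(buf, buf + 1, buf + 2);
    cudaDeviceSynchronize();
    float result = __half2float(buf[2]);
    cudaFree(buf);
    return result;
}
```

With this helper, 0.75 + 0.5 saturates to 1.0, -0.5 + 0.25 saturates to 0.0, and 0.25 + 0.5 stays at 0.75 because it already lies inside the range.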
 __device__ __half __hdiv ( const __half a, const __half b )

Performs half division in round-to-nearest-even mode.
Description
Divides half input a by input b in round-to-nearest-even mode.
 __device__ __half __hfma ( const __half a, const __half b, const __half c )

Performs half fused multiply-add in round-to-nearest-even mode.
Description
Performs half multiply on inputs a and b, then performs a half add of the result with c, rounding the result once in round-to-nearest-even mode.
 __device__ __half __hfma_relu ( const __half a, const __half b, const __half c )

Performs half fused multiply-add in round-to-nearest-even mode with relu saturation.
Parameters
 a
  half. Is only being read.
 b
  half. Is only being read.
 c
  half. Is only being read.
Returns
half
 The result of fused multiply-add operation on a, b, and c with relu saturation.
Description
Performs half multiply on inputs a and b, then performs a half add of the result with c, rounding the result once in round-to-nearest-even mode. A negative result is then clamped to 0, and a NaN result is converted to canonical NaN.
 __device__ __half __hfma_sat ( const __half a, const __half b, const __half c )

Performs half fused multiply-add in round-to-nearest-even mode, with saturation to [0.0, 1.0].
Parameters
 a
  half. Is only being read.
 b
  half. Is only being read.
 c
  half. Is only being read.
Returns
half
 The result of fused multiply-add operation on a, b, and c, with respect to saturation.
Description
Performs half multiply on inputs a and b, then performs a half add of the result with c, rounding the result once in round-to-nearest-even mode, and clamps the result to the range [0.0, 1.0]. NaN results are flushed to +0.0.
 __device__ __half __hmul ( const __half a, const __half b )

Performs half multiplication in round-to-nearest-even mode.
Description
Performs half multiplication of inputs a and b, in round-to-nearest-even mode.
 __device__ __half __hmul_rn ( const __half a, const __half b )

Performs half multiplication in round-to-nearest-even mode.
Description
Performs half multiplication of inputs a and b, in round-to-nearest-even mode. Prevents floating-point contractions of mul+add or sub into fma.
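The difference between a fused multiply-add (one rounding) and an uncontracted multiply followed by an add (two roundings) can be sketched as below. The kernel fused_vs_separate and the helper mul_add_on_device are hypothetical names. The sketch assumes compute capability 5.3 or higher and managed memory support, and omits error checking. The inputs in the usage note were chosen so that the exact product 1026.0009765625 is not representable in half precision.

```cuda
#include <cuda_fp16.h>
#include <cuda_runtime.h>

// Computes a * b + c in two ways:
//   out[0]: fused, the exact product is added to c and rounded once.
//   out[1]: separate, the product is rounded to half before the addition;
//           the _rn variants keep the compiler from contracting this into fma.
__global__ void fused_vs_separate(const __half* in, __half* out) {
    __half a = in[0], b = in[1], c = in[2];
    out[0] = __hfma(a, b, c);
    out[1] = __hadd_rn(__hmul_rn(a, b), c);
}

// Hypothetical host helper: returns the fused result when fused is true,
// otherwise the separately rounded result. Error checking is omitted.
float mul_add_on_device(float a, float b, float c, bool fused) {
    __half* buf = nullptr;
    cudaMallocManaged(&buf, 5 * sizeof(__half));  // 3 inputs followed by 2 outputs
    buf[0] = __float2half(a);
    buf[1] = __float2half(b);
    buf[2] = __float2half(c);
    fused_vs_separate<<<1, 1>>>(buf, buf + 3);
    cudaDeviceSynchronize();
    float result = __half2float(fused ? buf[3] : buf[4]);
    cudaFree(buf);
    return result;
}
```

For a = b = 32.03125 and c = -1026, the fused path keeps the low-order term and yields 2^-10 = 0.0009765625, while the separate path first rounds the product to 1026 and therefore yields 0.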
 __device__ __half __hmul_sat ( const __half a, const __half b )

Performs half multiplication in round-to-nearest-even mode, with saturation to [0.0, 1.0].
Parameters
 a
  half. Is only being read.
 b
  half. Is only being read.
Returns
half
 The result of multiplying a and b, with respect to saturation.
Description
Performs half multiplication of inputs a and b, in round-to-nearest-even mode, and clamps the result to the range [0.0, 1.0]. NaN results are flushed to +0.0.
 __device__ __half __hneg ( const __half a )

Negates input half number and returns the result.
Description
Negates input half number and returns the result.
 __device__ __half __hsub ( const __half a, const __half b )

Performs half subtraction in round-to-nearest-even mode.
Description
Subtracts half input b from input a in round-to-nearest-even mode.
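As a small sketch of how these intrinsics compose, the kernel below combines __hsub with __habs to compute an absolute difference. The names abs_diff_kernel and abs_diff_on_device are hypothetical. The sketch assumes compute capability 5.3 or higher and managed memory support, and omits error checking.

```cuda
#include <cuda_fp16.h>
#include <cuda_runtime.h>

// Single-thread kernel: out = |a - b| in half precision.
__global__ void abs_diff_kernel(const __half* a, const __half* b, __half* out) {
    *out = __habs(__hsub(*a, *b));
}

// Hypothetical host helper: computes |x - y| on the device and returns it
// as float. Error checking is omitted for brevity.
float abs_diff_on_device(float x, float y) {
    __half* buf = nullptr;
    cudaMallocManaged(&buf, 3 * sizeof(__half));  // assumes managed memory support
    buf[0] = __float2half(x);
    buf[1] = __float2half(y);
    abs_diff_kernel<<<1, 1>>>(buf, buf + 1, buf + 2);
    cudaDeviceSynchronize();
    float result = __half2float(buf[2]);
    cudaFree(buf);
    return result;
}
```

Both abs_diff_on_device(1.5f, 4.0f) and abs_diff_on_device(4.0f, 1.5f) should return 2.5f.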
 __device__ __half __hsub_rn ( const __half a, const __half b )

Performs half subtraction in round-to-nearest-even mode.
Description
Subtracts half input b from input a in round-to-nearest-even mode. Prevents floating-point contractions of mul+sub into fma.
 __device__ __half __hsub_sat ( const __half a, const __half b )

Performs half subtraction in round-to-nearest-even mode, with saturation to [0.0, 1.0].
Parameters
 a
  half. Is only being read.
 b
  half. Is only being read.
Returns
half
 The result of subtraction of b from a, with respect to saturation.
Description
Subtracts half input b from input a in round-to-nearest-even mode, and clamps the result to the range [0.0, 1.0]. NaN results are flushed to +0.0.
 __device__ __half atomicAdd ( __half* const address, const __half val )

Adds val to the value stored at address in global or shared memory, and writes the sum back to address. The read, add, and write are performed as one atomic operation.
Parameters
 address
  half*. An address in global or shared memory.
 val
  half. The value to be added.
Returns
half
 The old value read from address.
Description
The location of address must be in global or shared memory. This operation has undefined behavior otherwise. This operation is only supported by devices of compute capability 7.x and higher.
Note: For more details on this function, see the Atomic Functions section in the CUDA C++ Programming Guide.
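A minimal sketch of the half-precision atomicAdd follows: every thread adds 1.0 to a single accumulator in global memory. The names count_threads and count_with_atomic_add are hypothetical. The sketch assumes the code is compiled for and run on compute capability 7.0 or higher (for example, nvcc -arch=sm_70), assumes managed memory support, and omits error checking.

```cuda
#include <cuda_fp16.h>
#include <cuda_runtime.h>

// Each thread atomically adds 1.0 to *total. The return value of atomicAdd
// (the old value at the address) is not needed here and is discarded.
__global__ void count_threads(__half* total) {
    atomicAdd(total, __float2half(1.0f));
}

// Hypothetical host helper: launches one block of `threads` threads and
// returns the accumulated value as float. Error checking is omitted.
float count_with_atomic_add(int threads) {
    __half* total = nullptr;
    cudaMallocManaged(&total, sizeof(__half));  // assumes managed memory support
    *total = __float2half(0.0f);
    count_threads<<<1, threads>>>(total);
    cudaDeviceSynchronize();
    float result = __half2float(*total);
    cudaFree(total);
    return result;
}
```

With 256 threads the accumulator should end at 256. Half precision represents consecutive integers exactly only up to 2048, so a counter like this loses increments beyond that point; accumulating in a wider type is usually the safer design for larger sums.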