CUDA Math API :: CUDA Toolkit Documentation

1.2.7. Bfloat162 Math Functions

[Bfloat16 Precision Intrinsics]

To use these functions, include the header file cuda_bf16.h in your program.

Functions

__device__ __nv_bfloat162 atomicAdd ( __nv_bfloat162* const address, const __nv_bfloat162 val ): Vector adds val to the value stored at address in global or shared memory, and writes the result back to address. The atomicity of the add operation is guaranteed separately for each of the two nv_bfloat16 elements; the entire __nv_bfloat162 is not guaranteed to be atomic as a single 32-bit access.
__device__ __nv_bfloat162 h2ceil ( const __nv_bfloat162 h ): Calculate nv_bfloat162 vector ceiling of the input argument.
__device__ __nv_bfloat162 h2cos ( const __nv_bfloat162 a ): Calculates nv_bfloat162 vector cosine in round-to-nearest-even mode.
__device__ __nv_bfloat162 h2exp ( const __nv_bfloat162 a ): Calculates nv_bfloat162 vector exponential function in round-to-nearest-even mode.
__device__ __nv_bfloat162 h2exp10 ( const __nv_bfloat162 a ): Calculates nv_bfloat162 vector decimal exponential function in round-to-nearest-even mode.
__device__ __nv_bfloat162 h2exp2 ( const __nv_bfloat162 a ): Calculates nv_bfloat162 vector binary exponential function in round-to-nearest-even mode.
__device__ __nv_bfloat162 h2floor ( const __nv_bfloat162 h ): Calculate the largest integer less than or equal to h.
__device__ __nv_bfloat162 h2log ( const __nv_bfloat162 a ): Calculates nv_bfloat162 vector natural logarithm in round-to-nearest-even mode.
__device__ __nv_bfloat162 h2log10 ( const __nv_bfloat162 a ): Calculates nv_bfloat162 vector decimal logarithm in round-to-nearest-even mode.
__device__ __nv_bfloat162 h2log2 ( const __nv_bfloat162 a ): Calculates nv_bfloat162 vector binary logarithm in round-to-nearest-even mode.
__device__ __nv_bfloat162 h2rcp ( const __nv_bfloat162 a ): Calculates nv_bfloat162 vector reciprocal in round-to-nearest-even mode.
__device__ __nv_bfloat162 h2rint ( const __nv_bfloat162 h ): Round input to nearest integer value in nv_bfloat16 floating-point number.
__device__ __nv_bfloat162 h2rsqrt ( const __nv_bfloat162 a ): Calculates nv_bfloat162 vector reciprocal square root in round-to-nearest-even mode.
__device__ __nv_bfloat162 h2sin ( const __nv_bfloat162 a ): Calculates nv_bfloat162 vector sine in round-to-nearest-even mode.
__device__ __nv_bfloat162 h2sqrt ( const __nv_bfloat162 a ): Calculates nv_bfloat162 vector square root in round-to-nearest-even mode.
__device__ __nv_bfloat162 h2trunc ( const __nv_bfloat162 h ): Truncate nv_bfloat162 vector input argument to the integral part.

Functions

__device__ __nv_bfloat162 atomicAdd ( __nv_bfloat162* const address, const __nv_bfloat162 val )

Vector adds val to the value stored at address in global or shared memory, and writes the result back to address. The atomicity of the add operation is guaranteed separately for each of the two nv_bfloat16 elements; the entire __nv_bfloat162 is not guaranteed to be atomic as a single 32-bit access.

Parameters

address: - __nv_bfloat162*. An address in global or shared memory.
val: - __nv_bfloat162. The value to be added.

Returns

__nv_bfloat162

The old value read from address.

Description

The location of address must be in global or shared memory. This operation has undefined behavior otherwise. This operation is only supported by devices of compute capability 8.x and higher.

Note:

For more details on this function, see the Atomic Functions section in the CUDA C++ Programming Guide.
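The following is a minimal sketch of how the packed atomicAdd might be used to accumulate pairs into a single location in global memory. The names accumulate_pairs, unpack_pair, and run_accumulate are hypothetical and not part of the API; the sketch assumes a device of compute capability 8.0 or higher and a matching compile target (for example -arch=sm_80), and it omits error checking for brevity.

```cuda
#include <cuda_bf16.h>
#include <cuda_runtime.h>

// Hypothetical kernel: every thread adds the pair (1.0, 2.0) to the same location.
__global__ void accumulate_pairs(__nv_bfloat162* acc)
{
    const __nv_bfloat162 val = __floats2bfloat162_rn(1.0f, 2.0f);
    // atomicAdd returns the pair stored at acc before this thread's add; unused here.
    (void)atomicAdd(acc, val);
}

// Converts the packed result to floats on the device so the host needs no bf16 code.
__global__ void unpack_pair(const __nv_bfloat162* in, float2* out)
{
    *out = __bfloat1622float2(*in);
}

// Hypothetical host helper: runs `threads` adders in one block and returns both sums.
float2 run_accumulate(int threads)
{
    __nv_bfloat162* d_acc = nullptr;
    float2* d_out = nullptr;
    float2 h_out = make_float2(0.0f, 0.0f);

    cudaMalloc(&d_acc, sizeof(__nv_bfloat162));
    cudaMalloc(&d_out, sizeof(float2));
    cudaMemset(d_acc, 0, sizeof(__nv_bfloat162));  // all-zero bits encode (+0.0, +0.0)

    accumulate_pairs<<<1, threads>>>(d_acc);
    unpack_pair<<<1, 1>>>(d_acc, d_out);           // same stream, so it runs after the adds
    cudaMemcpy(&h_out, d_out, sizeof(float2), cudaMemcpyDeviceToHost);

    cudaFree(d_acc);
    cudaFree(d_out);
    return h_out;
}
```

Calling run_accumulate(32) should return sums of 32 and 64, because every partial sum is a small integer that bfloat16 represents exactly. With much larger thread counts the sums would start to lose precision, since each element is accumulated in bfloat16 rather than in float.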

__device__ __nv_bfloat162 h2ceil ( const __nv_bfloat162 h )

Calculate nv_bfloat162 vector ceiling of the input argument.

Parameters

h: - nv_bfloat162. Is only being read.

Returns

nv_bfloat162

The vector of smallest integers not less than h.

Description

For each component of vector h compute the smallest integer value not less than h.

__device__ __nv_bfloat162 h2cos ( const __nv_bfloat162 a )

Calculates nv_bfloat162 vector cosine in round-to-nearest-even mode.

Parameters

a: - nv_bfloat162. Is only being read.

Returns

nv_bfloat162

The elementwise cosine on vector a.

Description

Calculates nv_bfloat162 cosine of input vector a in round-to-nearest-even mode.

__device__ __nv_bfloat162 h2exp ( const __nv_bfloat162 a )

Calculates nv_bfloat162 vector exponential function in round-to-nearest-even mode.

Parameters

a: - nv_bfloat162. Is only being read.

Returns

nv_bfloat162

The elementwise exponential function on vector a.

Description

Calculates nv_bfloat162 exponential function of input vector a in round-to-nearest-even mode.

__device__ __nv_bfloat162 h2exp10 ( const __nv_bfloat162 a )

Calculates nv_bfloat162 vector decimal exponential function in round-to-nearest-even mode.

Parameters

a: - nv_bfloat162. Is only being read.

Returns

nv_bfloat162

The elementwise decimal exponential function on vector a.

Description

Calculates nv_bfloat162 decimal exponential function of input vector a in round-to-nearest-even mode.

__device__ __nv_bfloat162 h2exp2 ( const __nv_bfloat162 a )

Calculates nv_bfloat162 vector binary exponential function in round-to-nearest-even mode.

Parameters

a: - nv_bfloat162. Is only being read.

Returns

nv_bfloat162

The elementwise binary exponential function on vector a.

Description

Calculates nv_bfloat162 binary exponential function of input vector a in round-to-nearest-even mode.
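As an illustration of the three exponential functions described so far, the sketch below applies h2exp, h2exp2, and h2exp10 to the same packed input and converts the results to float for inspection. The names exp_family and run_exp_family are hypothetical; the sketch assumes a compile target of sm_80 or newer and omits error checking.

```cuda
#include <cuda_bf16.h>
#include <cuda_runtime.h>

// Hypothetical kernel: evaluates e^x, 2^x and 10^x on both lanes of one packed value.
__global__ void exp_family(float x0, float x1, float2* out)
{
    const __nv_bfloat162 a = __floats2bfloat162_rn(x0, x1);
    out[0] = __bfloat1622float2(h2exp(a));    // natural exponential
    out[1] = __bfloat1622float2(h2exp2(a));   // binary exponential
    out[2] = __bfloat1622float2(h2exp10(a));  // decimal exponential
}

// Hypothetical host helper: fills out[0..2] with the three results.
void run_exp_family(float x0, float x1, float2 out[3])
{
    float2* d_out = nullptr;
    cudaMalloc(&d_out, 3 * sizeof(float2));
    exp_family<<<1, 1>>>(x0, x1, d_out);
    // cudaMemcpy on the default stream waits for the kernel to finish.
    cudaMemcpy(out, d_out, 3 * sizeof(float2), cudaMemcpyDeviceToHost);
    cudaFree(d_out);
}
```

For the input (3, -1), the binary exponential lane values should come out close to 8 and 0.5. The other results are only accurate to the roughly three significant decimal digits that bfloat16 can hold.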

__device__ __nv_bfloat162 h2floor ( const __nv_bfloat162 h )

Calculate the largest integer less than or equal to h.

Parameters

h: - nv_bfloat162. Is only being read.

Returns

nv_bfloat162

The vector of largest integers that are less than or equal to h.

Description

For each component of vector h calculate the largest integer value which is less than or equal to h.

__device__ __nv_bfloat162 h2log ( const __nv_bfloat162 a )

Calculates nv_bfloat162 vector natural logarithm in round-to-nearest-even mode.

Parameters

a: - nv_bfloat162. Is only being read.

Returns

nv_bfloat162

The elementwise natural logarithm on vector a.

Description

Calculates nv_bfloat162 natural logarithm of input vector a in round-to-nearest-even mode.

__device__ __nv_bfloat162 h2log10 ( const __nv_bfloat162 a )

Calculates nv_bfloat162 vector decimal logarithm in round-to-nearest-even mode.

Parameters

a: - nv_bfloat162. Is only being read.

Returns

nv_bfloat162

The elementwise decimal logarithm on vector a.

Description

Calculates nv_bfloat162 decimal logarithm of input vector a in round-to-nearest-even mode.

__device__ __nv_bfloat162 h2log2 ( const __nv_bfloat162 a )

Calculates nv_bfloat162 vector binary logarithm in round-to-nearest-even mode.

Parameters

a: - nv_bfloat162. Is only being read.

Returns

nv_bfloat162

The elementwise binary logarithm on vector a.

Description

Calculates nv_bfloat162 binary logarithm of input vector a in round-to-nearest-even mode.
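The three logarithm functions can be compared side by side in the same way. This is a sketch under stated assumptions rather than a reference implementation: log_family and run_log_family are hypothetical names, the compile target is assumed to be sm_80 or newer, and error checking is omitted.

```cuda
#include <cuda_bf16.h>
#include <cuda_runtime.h>

// Hypothetical kernel: evaluates ln, log2 and log10 on both lanes of one packed value.
__global__ void log_family(float x0, float x1, float2* out)
{
    const __nv_bfloat162 a = __floats2bfloat162_rn(x0, x1);
    out[0] = __bfloat1622float2(h2log(a));    // natural logarithm
    out[1] = __bfloat1622float2(h2log2(a));   // binary logarithm
    out[2] = __bfloat1622float2(h2log10(a));  // decimal logarithm
}

// Hypothetical host helper: fills out[0..2] with the three results.
void run_log_family(float x0, float x1, float2 out[3])
{
    float2* d_out = nullptr;
    cudaMalloc(&d_out, 3 * sizeof(float2));
    log_family<<<1, 1>>>(x0, x1, d_out);
    // cudaMemcpy on the default stream waits for the kernel to finish.
    cudaMemcpy(out, d_out, 3 * sizeof(float2), cudaMemcpyDeviceToHost);
    cudaFree(d_out);
}
```

With the input (8, 1000), both of which bfloat16 represents exactly, the first lane of h2log2 and the second lane of h2log10 should each be close to 3.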

__device__ __nv_bfloat162 h2rcp ( const __nv_bfloat162 a )

Calculates nv_bfloat162 vector reciprocal in round-to-nearest-even mode.

Parameters

a: - nv_bfloat162. Is only being read.

Returns

nv_bfloat162

The elementwise reciprocal on vector a.

Description

Calculates nv_bfloat162 reciprocal of input vector a in round-to-nearest-even mode.

__device__ __nv_bfloat162 h2rint ( const __nv_bfloat162 h )

Round input to nearest integer value in nv_bfloat16 floating-point number.

Parameters

h: - nv_bfloat162. Is only being read.

Returns

nv_bfloat162

The vector of rounded integer values.

Description

Round each component of nv_bfloat162 vector h to the nearest integer value in nv_bfloat16 floating-point format, with halfway cases rounded to the nearest even integer value.

__device__ __nv_bfloat162 h2rsqrt ( const __nv_bfloat162 a )

Calculates nv_bfloat162 vector reciprocal square root in round-to-nearest-even mode.

Parameters

a: - nv_bfloat162. Is only being read.

Returns

nv_bfloat162

The elementwise reciprocal square root on vector a.

Description

Calculates nv_bfloat162 reciprocal square root of input vector a in round-to-nearest-even mode.

__device__ __nv_bfloat162 h2sin ( const __nv_bfloat162 a )

Calculates nv_bfloat162 vector sine in round-to-nearest-even mode.

Parameters

a: - nv_bfloat162. Is only being read.

Returns

nv_bfloat162

The elementwise sine on vector a.

Description

Calculates nv_bfloat162 sine of input vector a in round-to-nearest-even mode.
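One way to sanity-check h2sin together with h2cos is to evaluate sin² + cos² on each lane and confirm that the result stays near 1. The sketch below does this with the packed arithmetic intrinsics __hmul2 and __hfma2 from the same header. The names pythagorean_identity and run_pythagorean_identity are hypothetical, the compile target is assumed to be sm_80 or newer, and error checking is omitted.

```cuda
#include <cuda_bf16.h>
#include <cuda_runtime.h>

// Hypothetical kernel: computes sin(x)^2 + cos(x)^2 per lane in bfloat16 arithmetic.
__global__ void pythagorean_identity(float x0, float x1, float2* out)
{
    const __nv_bfloat162 a = __floats2bfloat162_rn(x0, x1);
    const __nv_bfloat162 s = h2sin(a);
    const __nv_bfloat162 c = h2cos(a);
    // Fused multiply-add per lane: s * s + (c * c).
    const __nv_bfloat162 sum = __hfma2(s, s, __hmul2(c, c));
    *out = __bfloat1622float2(sum);
}

// Hypothetical host helper: returns the per-lane sums as floats.
float2 run_pythagorean_identity(float x0, float x1)
{
    float2* d_out = nullptr;
    float2 h_out = make_float2(0.0f, 0.0f);
    cudaMalloc(&d_out, sizeof(float2));
    pythagorean_identity<<<1, 1>>>(x0, x1, d_out);
    // cudaMemcpy on the default stream waits for the kernel to finish.
    cudaMemcpy(&h_out, d_out, sizeof(float2), cudaMemcpyDeviceToHost);
    cudaFree(d_out);
    return h_out;
}
```

Because every intermediate value is rounded to bfloat16, the sums are expected to differ from 1 by a small amount rather than match it exactly.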

__device__ __nv_bfloat162 h2sqrt ( const __nv_bfloat162 a )

Calculates nv_bfloat162 vector square root in round-to-nearest-even mode.

Parameters

a: - nv_bfloat162. Is only being read.

Returns

nv_bfloat162

The elementwise square root on vector a.

Description

Calculates nv_bfloat162 square root of input vector a in round-to-nearest-even mode.
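The relationship between h2sqrt, h2rsqrt, and h2rcp can be illustrated with a short sketch that evaluates all three on one packed input. The names root_family and run_root_family are hypothetical; the sketch assumes a compile target of sm_80 or newer and omits error checking.

```cuda
#include <cuda_bf16.h>
#include <cuda_runtime.h>

// Hypothetical kernel: evaluates sqrt(x), 1/sqrt(x) and 1/x on both lanes.
__global__ void root_family(float x0, float x1, float2* out)
{
    const __nv_bfloat162 a = __floats2bfloat162_rn(x0, x1);
    out[0] = __bfloat1622float2(h2sqrt(a));   // square root
    out[1] = __bfloat1622float2(h2rsqrt(a));  // reciprocal square root
    out[2] = __bfloat1622float2(h2rcp(a));    // reciprocal
}

// Hypothetical host helper: fills out[0..2] with the three results.
void run_root_family(float x0, float x1, float2 out[3])
{
    float2* d_out = nullptr;
    cudaMalloc(&d_out, 3 * sizeof(float2));
    root_family<<<1, 1>>>(x0, x1, d_out);
    // cudaMemcpy on the default stream waits for the kernel to finish.
    cudaMemcpy(out, d_out, 3 * sizeof(float2), cudaMemcpyDeviceToHost);
    cudaFree(d_out);
}
```

For the input (4, 16), the results should be close to (2, 4) for the square root, (0.5, 0.25) for the reciprocal square root, and (0.25, 0.0625) for the reciprocal. When only 1/sqrt(x) is needed, a single h2rsqrt call avoids the second rounding step that composing h2sqrt with h2rcp would introduce.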

__device__ __nv_bfloat162 h2trunc ( const __nv_bfloat162 h )

Truncate nv_bfloat162 vector input argument to the integral part.

Parameters

h: - nv_bfloat162. Is only being read.

Returns

nv_bfloat162

The truncated h.

Description

Round each component of vector h to the nearest integer value that does not exceed h in magnitude.
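The four rounding functions in this section (h2ceil, h2floor, h2rint, and h2trunc) differ only in the direction they round, which is easiest to see on a halfway input. The sketch below applies all four to one packed value; rounding_modes and run_rounding_modes are hypothetical names, the compile target is assumed to be sm_80 or newer, and error checking is omitted.

```cuda
#include <cuda_bf16.h>
#include <cuda_runtime.h>

// Hypothetical kernel: applies each rounding function to both lanes of one packed value.
__global__ void rounding_modes(float x0, float x1, float2* out)
{
    const __nv_bfloat162 h = __floats2bfloat162_rn(x0, x1);
    out[0] = __bfloat1622float2(h2ceil(h));   // toward +infinity
    out[1] = __bfloat1622float2(h2floor(h));  // toward -infinity
    out[2] = __bfloat1622float2(h2rint(h));   // to nearest, halfway cases to even
    out[3] = __bfloat1622float2(h2trunc(h));  // toward zero
}

// Hypothetical host helper: fills out[0..3] with the four results.
void run_rounding_modes(float x0, float x1, float2 out[4])
{
    float2* d_out = nullptr;
    cudaMalloc(&d_out, 4 * sizeof(float2));
    rounding_modes<<<1, 1>>>(x0, x1, d_out);
    // cudaMemcpy on the default stream waits for the kernel to finish.
    cudaMemcpy(out, d_out, 4 * sizeof(float2), cudaMemcpyDeviceToHost);
    cudaFree(d_out);
}
```

For the input (2.5, -2.5), which bfloat16 represents exactly, the definitions above give (3, -2) for h2ceil, (2, -3) for h2floor, (2, -2) for h2rint, and (2, -2) for h2trunc.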