CUDA Math API :: CUDA Toolkit Documentation

1.5. Single Precision Intrinsics

This section describes single precision intrinsic functions that are only supported in device code.

Functions

__device__ float __cosf ( float x ): Calculate the fast approximate cosine of the input argument.
__device__ float __exp10f ( float x ): Calculate the fast approximate base 10 exponential of the input argument.
__device__ float __expf ( float x ): Calculate the fast approximate base $e$ exponential of the input argument.
__device__ float __fadd_rd ( float x, float y ): Add two floating point values in round-down mode.
__device__ float __fadd_rn ( float x, float y ): Add two floating point values in round-to-nearest-even mode.
__device__ float __fadd_ru ( float x, float y ): Add two floating point values in round-up mode.
__device__ float __fadd_rz ( float x, float y ): Add two floating point values in round-towards-zero mode.
__device__ float __fdiv_rd ( float x, float y ): Divide two floating point values in round-down mode.
__device__ float __fdiv_rn ( float x, float y ): Divide two floating point values in round-to-nearest-even mode.
__device__ float __fdiv_ru ( float x, float y ): Divide two floating point values in round-up mode.
__device__ float __fdiv_rz ( float x, float y ): Divide two floating point values in round-towards-zero mode.
__device__ float __fdividef ( float x, float y ): Calculate the fast approximate division of the input arguments.
__device__ float __fmaf_rd ( float x, float y, float z ): Compute $x \times y + z$ as a single operation, in round-down mode.
__device__ float __fmaf_rn ( float x, float y, float z ): Compute $x \times y + z$ as a single operation, in round-to-nearest-even mode.
__device__ float __fmaf_ru ( float x, float y, float z ): Compute $x \times y + z$ as a single operation, in round-up mode.
__device__ float __fmaf_rz ( float x, float y, float z ): Compute $x \times y + z$ as a single operation, in round-towards-zero mode.
__device__ float __fmul_rd ( float x, float y ): Multiply two floating point values in round-down mode.
__device__ float __fmul_rn ( float x, float y ): Multiply two floating point values in round-to-nearest-even mode.
__device__ float __fmul_ru ( float x, float y ): Multiply two floating point values in round-up mode.
__device__ float __fmul_rz ( float x, float y ): Multiply two floating point values in round-towards-zero mode.
__device__ float __frcp_rd ( float x ): Compute $\frac{1}{x}$ in round-down mode.
__device__ float __frcp_rn ( float x ): Compute $\frac{1}{x}$ in round-to-nearest-even mode.
__device__ float __frcp_ru ( float x ): Compute $\frac{1}{x}$ in round-up mode.
__device__ float __frcp_rz ( float x ): Compute $\frac{1}{x}$ in round-towards-zero mode.
__device__ float __frsqrt_rn ( float x ): Compute $1 / \sqrt{x}$ in round-to-nearest-even mode.
__device__ float __fsqrt_rd ( float x ): Compute $\sqrt{x}$ in round-down mode.
__device__ float __fsqrt_rn ( float x ): Compute $\sqrt{x}$ in round-to-nearest-even mode.
__device__ float __fsqrt_ru ( float x ): Compute $\sqrt{x}$ in round-up mode.
__device__ float __fsqrt_rz ( float x ): Compute $\sqrt{x}$ in round-towards-zero mode.
__device__ float __fsub_rd ( float x, float y ): Subtract two floating point values in round-down mode.
__device__ float __fsub_rn ( float x, float y ): Subtract two floating point values in round-to-nearest-even mode.
__device__ float __fsub_ru ( float x, float y ): Subtract two floating point values in round-up mode.
__device__ float __fsub_rz ( float x, float y ): Subtract two floating point values in round-towards-zero mode.
__device__ float __log10f ( float x ): Calculate the fast approximate base 10 logarithm of the input argument.
__device__ float __log2f ( float x ): Calculate the fast approximate base 2 logarithm of the input argument.
__device__ float __logf ( float x ): Calculate the fast approximate base $e$ logarithm of the input argument.
__device__ float __powf ( float x, float y ): Calculate the fast approximation of $x^{y}$.
__device__ float __saturatef ( float x ): Clamp the input argument to [+0.0, 1.0].
__device__ void __sincosf ( float x, float* sptr, float* cptr ): Calculate the fast approximate sine and cosine of the first input argument.
__device__ float __sinf ( float x ): Calculate the fast approximate sine of the input argument.
__device__ float __tanf ( float x ): Calculate the fast approximate tangent of the input argument.

Functions

__device__ float __cosf ( float x )

Calculate the fast approximate cosine of the input argument.

Returns

Returns the approximate cosine of x.

Description

Calculate the fast approximate cosine of the input argument x, measured in radians.

Note:

For accuracy information for this function see the CUDA C Programming Guide, Appendix D.2, Table 9.
Input and output in the denormal range are flushed to sign-preserving 0.0.
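
As an illustration only, the following sketch prints __cosf next to the standard cosf for a few arguments so that the accuracy trade-off can be inspected directly. The kernel name compare_cos and the sample inputs are hypothetical choices made for this example and are not part of the API.

```cuda
#include <cmath>
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical demo kernel: evaluate the fast intrinsic and the standard
// single precision function for the same argument and print both.
__global__ void compare_cos(float x)
{
    float fast = __cosf(x);  // fast approximate intrinsic (device code only)
    float ref  = cosf(x);    // standard single precision math function
    printf("x = %g  __cosf = %.9g  cosf = %.9g  |diff| = %.3g\n",
           x, fast, ref, fabsf(fast - ref));
}

int main()
{
    const float inputs[] = {0.5f, 3.0f, 100.0f};  // arbitrary sample arguments
    for (float x : inputs) {
        compare_cos<<<1, 1>>>(x);
    }
    cudaError_t err = cudaGetLastError();          // launch errors
    if (err == cudaSuccess) err = cudaDeviceSynchronize();  // execution errors
    if (err != cudaSuccess) {
        fprintf(stderr, "CUDA error: %s\n", cudaGetErrorString(err));
        return 1;
    }
    return 0;
}
```

If the file is compiled with -use_fast_math, cosf is itself compiled to __cosf, so the two columns would be expected to agree; build without that flag to observe a difference.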

__device__ float __exp10f ( float x )

Calculate the fast approximate base 10 exponential of the input argument.

Returns

Returns an approximation to $10^{x}$ .

Description

Calculate the fast approximate base 10 exponential of the input argument x, $10^{x}$ .

Note:

For accuracy information for this function see the CUDA C Programming Guide, Appendix D.2, Table 9.
Most input and output values around the denormal range are flushed to sign-preserving 0.0.

__device__ float __expf ( float x )

Calculate the fast approximate base $e$ exponential of the input argument.

Returns

Returns an approximation to $e^{x}$ .

Description

Calculate the fast approximate base $e$ exponential of the input argument x, $e^{x}$ .

Note:

For accuracy information for this function see the CUDA C Programming Guide, Appendix D.2, Table 9.
Most input and output values around the denormal range are flushed to sign-preserving 0.0.

__device__ float __fadd_rd ( float x, float y )

Add two floating point values in round-down mode.

Returns

Returns x + y.

Description

Compute the sum of x and y in round-down (to negative infinity) mode.

Note:

For accuracy information for this function see the CUDA C Programming Guide, Appendix D.1, Table 6.
This operation will never be merged into a single multiply-add instruction.

__device__ float __fadd_rn ( float x, float y )

Add two floating point values in round-to-nearest-even mode.

Returns

Returns x + y.

Description

Compute the sum of x and y in round-to-nearest-even mode.

Note:

For accuracy information for this function see the CUDA C Programming Guide, Appendix D.1, Table 6.
This operation will never be merged into a single multiply-add instruction.

__device__ float __fadd_ru ( float x, float y )

Add two floating point values in round-up mode.

Returns

Returns x + y.

Description

Compute the sum of x and y in round-up (to positive infinity) mode.

Note:

For accuracy information for this function see the CUDA C Programming Guide, Appendix D.1, Table 6.
This operation will never be merged into a single multiply-add instruction.

__device__ float __fadd_rz ( float x, float y )

Add two floating point values in round-towards-zero mode.

Returns

Returns x + y.

Description

Compute the sum of x and y in round-towards-zero mode.

Note:

For accuracy information for this function see the CUDA C Programming Guide, Appendix D.1, Table 6.
This operation will never be merged into a single multiply-add instruction.
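
The four rounding variants can be compared side by side. The sketch below, with a hypothetical kernel name add_all_modes and inputs chosen so that the exact sum is not representable in single precision, adds the same operands under each mode. The round-down and round-up results bracket the exact sum, which is the basis of interval arithmetic.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical demo kernel: add a and b under each of the four rounding modes.
__global__ void add_all_modes(float a, float b)
{
    float down    = __fadd_rd(a, b);  // toward negative infinity
    float nearest = __fadd_rn(a, b);  // to nearest, ties to even
    float up      = __fadd_ru(a, b);  // toward positive infinity
    float zero    = __fadd_rz(a, b);  // toward zero
    printf("rd = %.9g\nrn = %.9g\nru = %.9g\nrz = %.9g\n",
           down, nearest, up, zero);
    // The exact sum lies in the closed interval [down, up].
    printf("interval width = %.9g\n", up - down);
}

int main()
{
    // 1 + 1e-10 is not representable as a float, so the modes can disagree.
    add_all_modes<<<1, 1>>>(1.0f, 1e-10f);
    cudaError_t err = cudaGetLastError();
    if (err == cudaSuccess) err = cudaDeviceSynchronize();
    if (err != cudaSuccess) {
        fprintf(stderr, "CUDA error: %s\n", cudaGetErrorString(err));
        return 1;
    }
    return 0;
}
```

For these inputs the exact sum lies strictly between 1 and the next float above 1, so round-down, round-to-nearest-even, and round-towards-zero all yield 1, while round-up yields the next representable value above 1.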

__device__ float __fdiv_rd ( float x, float y )

Divide two floating point values in round-down mode.

Returns

Returns x / y.

Description

Divide x by y in round-down (to negative infinity) mode.

Note:

For accuracy information for this function see the CUDA C Programming Guide, Appendix D.1, Table 6.

__device__ float __fdiv_rn ( float x, float y )

Divide two floating point values in round-to-nearest-even mode.

Returns

Returns x / y.

Description

Divide x by y in round-to-nearest-even mode.

Note:

For accuracy information for this function see the CUDA C Programming Guide, Appendix D.1, Table 6.

__device__ float __fdiv_ru ( float x, float y )

Divide two floating point values in round-up mode.

Returns

Returns x / y.

Description

Divide x by y in round-up (to positive infinity) mode.

Note:

For accuracy information for this function see the CUDA C Programming Guide, Appendix D.1, Table 6.

__device__ float __fdiv_rz ( float x, float y )

Divide two floating point values in round-towards-zero mode.

Returns

Returns x / y.

Description

Divide x by y in round-towards-zero mode.

Note:

For accuracy information for this function see the CUDA C Programming Guide, Appendix D.1, Table 6.

__device__ float __fdividef ( float x, float y )

Calculate the fast approximate division of the input arguments.

Returns

Returns x / y.

__fdividef( $\infty$ , y) returns NaN for $2^{126} < |y| < 2^{128}$ .
__fdividef(x, y) returns 0 for $2^{126} < |y| < 2^{128}$ and finite x.

Description

Calculate the fast approximate division of x by y.

Note:

For accuracy information for this function see the CUDA C Programming Guide, Appendix D.2, Table 9.
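
The special cases listed under Returns are easy to overlook, so a small experiment may help. The sketch below compares __fdividef with __fdiv_rn for a divisor of $2^{127}$, which lies inside the documented range, and for an ordinary pair of operands. The kernel name divide_both_ways and the chosen operands are assumptions made for illustration.

```cuda
#include <cmath>
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical demo kernel: divide x by y with the fast intrinsic and with
// the round-to-nearest-even intrinsic, then print both results.
__global__ void divide_both_ways(float x, float y)
{
    float fast    = __fdividef(x, y);  // fast approximate division
    float rounded = __fdiv_rn(x, y);   // round-to-nearest-even division
    printf("x = %g, y = %g: __fdividef = %g, __fdiv_rn = %g\n",
           x, y, fast, rounded);
}

int main()
{
    const float big = ldexpf(1.0f, 127);  // 2^127, inside (2^126, 2^128)

    // Finite x with a large divisor: __fdividef is documented to return 0,
    // although the true quotient 2^-27 is an ordinary normal number.
    divide_both_ways<<<1, 1>>>(ldexpf(1.0f, 100), big);
    // Infinite x with a large divisor: __fdividef is documented to return NaN.
    divide_both_ways<<<1, 1>>>(INFINITY, big);
    // Ordinary operands, for comparison.
    divide_both_ways<<<1, 1>>>(6.0f, 3.0f);

    cudaError_t err = cudaGetLastError();
    if (err == cudaSuccess) err = cudaDeviceSynchronize();
    if (err != cudaSuccess) {
        fprintf(stderr, "CUDA error: %s\n", cudaGetErrorString(err));
        return 1;
    }
    return 0;
}
```

__fdiv_rn is used as the reference here rather than the / operator, because the code generated for / can depend on compiler options.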

__device__ float __fmaf_rd ( float x, float y, float z )

Compute $x \times y + z$ as a single operation, in round-down mode.

Returns

Returns the rounded value of $x \times y + z$ as a single operation.

fmaf( $\pm \infty$ , $\pm 0$ , z) returns NaN.
fmaf( $\pm 0$ , $\pm \infty$ , z) returns NaN.
fmaf(x, y, $- \infty$ ) returns NaN if $x \times y$ is an exact $+ \infty$ .
fmaf(x, y, $+ \infty$ ) returns NaN if $x \times y$ is an exact $- \infty$ .

Description

Computes the value of $x \times y + z$ as a single ternary operation, rounding the result once in round-down (to negative infinity) mode.

Note:

For accuracy information for this function see the CUDA C Programming Guide, Appendix D.1, Table 6.

__device__ float __fmaf_rn ( float x, float y, float z )

Compute $x \times y + z$ as a single operation, in round-to-nearest-even mode.

Returns

Returns the rounded value of $x \times y + z$ as a single operation.

fmaf( $\pm \infty$ , $\pm 0$ , z) returns NaN.
fmaf( $\pm 0$ , $\pm \infty$ , z) returns NaN.
fmaf(x, y, $- \infty$ ) returns NaN if $x \times y$ is an exact $+ \infty$ .
fmaf(x, y, $+ \infty$ ) returns NaN if $x \times y$ is an exact $- \infty$ .

Description

Computes the value of $x \times y + z$ as a single ternary operation, rounding the result once in round-to-nearest-even mode.

Note:

For accuracy information for this function see the CUDA C Programming Guide, Appendix D.1, Table 6.

__device__ float __fmaf_ru ( float x, float y, float z )

Compute $x \times y + z$ as a single operation, in round-up mode.

Returns

Returns the rounded value of $x \times y + z$ as a single operation.

fmaf( $\pm \infty$ , $\pm 0$ , z) returns NaN.
fmaf( $\pm 0$ , $\pm \infty$ , z) returns NaN.
fmaf(x, y, $- \infty$ ) returns NaN if $x \times y$ is an exact $+ \infty$ .
fmaf(x, y, $+ \infty$ ) returns NaN if $x \times y$ is an exact $- \infty$ .

Description

Computes the value of $x \times y + z$ as a single ternary operation, rounding the result once in round-up (to positive infinity) mode.

Note:

For accuracy information for this function see the CUDA C Programming Guide, Appendix D.1, Table 6.

__device__ float __fmaf_rz ( float x, float y, float z )

Compute $x \times y + z$ as a single operation, in round-towards-zero mode.

Returns

Returns the rounded value of $x \times y + z$ as a single operation.

fmaf( $\pm \infty$ , $\pm 0$ , z) returns NaN.
fmaf( $\pm 0$ , $\pm \infty$ , z) returns NaN.
fmaf(x, y, $- \infty$ ) returns NaN if $x \times y$ is an exact $+ \infty$ .
fmaf(x, y, $+ \infty$ ) returns NaN if $x \times y$ is an exact $- \infty$ .

Description

Computes the value of $x \times y + z$ as a single ternary operation, rounding the result once in round-towards-zero mode.

Note:

For accuracy information for this function see the CUDA C Programming Guide, Appendix D.1, Table 6.
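
To make "rounding the result once" concrete, the sketch below contrasts __fmaf_rn with a separately rounded multiply followed by an add. The kernel name fused_vs_separate is hypothetical, and the operands are chosen for this example so that the intermediate product is not representable in single precision.

```cuda
#include <cmath>
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical demo kernel: compute x * y + z with one rounding (fused) and
// with two roundings (separate multiply and add).
__global__ void fused_vs_separate(float x, float y, float z)
{
    float fused    = __fmaf_rn(x, y, z);              // single rounding
    float separate = __fadd_rn(__fmul_rn(x, y), z);   // product rounded first
    printf("fused    = %.9g\nseparate = %.9g\n", fused, separate);
}

int main()
{
    // x = 1 + 2^-12, so x * x = 1 + 2^-11 + 2^-24 exactly, which needs more
    // bits than a float significand holds.
    const float x = 1.0f + ldexpf(1.0f, -12);
    // z = -(1 + 2^-11), so the exact value of x * x + z is 2^-24.
    const float z = -(1.0f + ldexpf(1.0f, -11));

    fused_vs_separate<<<1, 1>>>(x, x, z);
    cudaError_t err = cudaGetLastError();
    if (err == cudaSuccess) err = cudaDeviceSynchronize();
    if (err != cudaSuccess) {
        fprintf(stderr, "CUDA error: %s\n", cudaGetErrorString(err));
        return 1;
    }
    return 0;
}
```

With these operands the fused result is $2^{-24}$, because the exact product takes part in the addition. The separate version rounds the product to $1 + 2^{-11}$ first (a tie resolved to the even significand), so the subsequent addition yields 0.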

__device__ float __fmul_rd ( float x, float y )

Multiply two floating point values in round-down mode.

Returns

Returns x * y.

Description

Compute the product of x and y in round-down (to negative infinity) mode.

Note:

For accuracy information for this function see the CUDA C Programming Guide, Appendix D.1, Table 6.
This operation will never be merged into a single multiply-add instruction.

__device__ float __fmul_rn ( float x, float y )

Multiply two floating point values in round-to-nearest-even mode.

Returns

Returns x * y.

Description

Compute the product of x and y in round-to-nearest-even mode.

Note:

For accuracy information for this function see the CUDA C Programming Guide, Appendix D.1, Table 6.
This operation will never be merged into a single multiply-add instruction.

__device__ float __fmul_ru ( float x, float y )

Multiply two floating point values in round-up mode.

Returns

Returns x * y.

Description

Compute the product of x and y in round-up (to positive infinity) mode.

Note:

For accuracy information for this function see the CUDA C Programming Guide, Appendix D.1, Table 6.
This operation will never be merged into a single multiply-add instruction.

__device__ float __fmul_rz ( float x, float y )

Multiply two floating point values in round-towards-zero mode.

Returns

Returns x * y.

Description

Compute the product of x and y in round-towards-zero mode.

Note:

For accuracy information for this function see the CUDA C Programming Guide, Appendix D.1, Table 6.
This operation will never be merged into a single multiply-add instruction.
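
The guarantee that these multiplies are never merged into a multiply-add matters for algorithms that depend on the rounded product itself. One such technique, offered here as a sketch rather than as anything prescribed by this API, is the error-free product: round the product with __fmul_rn, then recover the rounding error with __fmaf_rn. The kernel name two_product and the sample operands are hypothetical.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical demo kernel: split a * b into a rounded product p and a
// residual e such that p + e equals a * b (barring underflow or overflow).
__global__ void two_product(float a, float b)
{
    float p = __fmul_rn(a, b);      // rounded product, never fused with an add
    float e = __fmaf_rn(a, b, -p);  // a * b - p evaluated with one rounding

    // A product of two floats fits exactly in a double, so this check
    // itself introduces no rounding error.
    double exact = (double)a * (double)b;
    double recon = (double)p + (double)e;
    printf("p = %.9g, e = %.9g, p + e equals a * b: %s\n",
           p, e, exact == recon ? "yes" : "no");
}

int main()
{
    two_product<<<1, 1>>>(0.1f, 0.3f);  // arbitrary sample operands
    cudaError_t err = cudaGetLastError();
    if (err == cudaSuccess) err = cudaDeviceSynchronize();
    if (err != cudaSuccess) {
        fprintf(stderr, "CUDA error: %s\n", cudaGetErrorString(err));
        return 1;
    }
    return 0;
}
```

If the first multiply were written as a * b, a compiler would be free to contract it with a neighbouring addition, which is why the explicit intrinsic is used in this sketch.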

__device__ float __frcp_rd ( float x )

Compute $\frac{1}{x}$ in round-down mode.

Returns

Returns $\frac{1}{x}$ .

Description

Compute the reciprocal of x in round-down (to negative infinity) mode.

Note:

For accuracy information for this function see the CUDA C Programming Guide, Appendix D.1, Table 6.

__device__ float __frcp_rn ( float x )

Compute $\frac{1}{x}$ in round-to-nearest-even mode.

Returns

Returns $\frac{1}{x}$ .

Description

Compute the reciprocal of x in round-to-nearest-even mode.

Note:

For accuracy information for this function see the CUDA C Programming Guide, Appendix D.1, Table 6.

__device__ float __frcp_ru ( float x )

Compute $\frac{1}{x}$ in round-up mode.

Returns

Returns $\frac{1}{x}$ .

Description

Compute the reciprocal of x in round-up (to positive infinity) mode.

Note:

For accuracy information for this function see the CUDA C Programming Guide, Appendix D.1, Table 6.

__device__ float __frcp_rz ( float x )

Compute $\frac{1}{x}$ in round-towards-zero mode.

Returns

Returns $\frac{1}{x}$ .

Description

Compute the reciprocal of x in round-towards-zero mode.

Note:

For accuracy information for this function see the CUDA C Programming Guide, Appendix D.1, Table 6.

__device__ float __frsqrt_rn ( float x )

Compute $1 / \sqrt{x}$ in round-to-nearest-even mode.

Returns

Returns $1 / \sqrt{x}$ .

Description

Compute the reciprocal square root of x in round-to-nearest-even mode.

Note:

For accuracy information for this function see the CUDA C Programming Guide, Appendix D.1, Table 6.

__device__ float __fsqrt_rd ( float x )

Compute $\sqrt{x}$ in round-down mode.

Returns

Returns $\sqrt{x}$ .

Description

Compute the square root of x in round-down (to negative infinity) mode.

Note:

For accuracy information for this function see the CUDA C Programming Guide, Appendix D.1, Table 6.

__device__ float __fsqrt_rn ( float x )

Compute $\sqrt{x}$ in round-to-nearest-even mode.

Returns

Returns $\sqrt{x}$ .

Description

Compute the square root of x in round-to-nearest-even mode.

Note:

For accuracy information for this function see the CUDA C Programming Guide, Appendix D.1, Table 6.

__device__ float __fsqrt_ru ( float x )

Compute $\sqrt{x}$ in round-up mode.

Returns

Returns $\sqrt{x}$ .

Description

Compute the square root of x in round-up (to positive infinity) mode.

Note:

For accuracy information for this function see the CUDA C Programming Guide, Appendix D.1, Table 6.

__device__ float __fsqrt_rz ( float x )

Compute $\sqrt{x}$ in round-towards-zero mode.

Returns

Returns $\sqrt{x}$ .

Description

Compute the square root of x in round-towards-zero mode.

Note:

For accuracy information for this function see the CUDA C Programming Guide, Appendix D.1, Table 6.
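
The directed square root variants can be used to enclose an irrational result between two floats. The following sketch, with a hypothetical kernel name bracket_sqrt and an arbitrary input of 2, computes a lower and an upper bound and checks them by squaring in double precision.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical demo kernel: enclose sqrt(x) between a round-down and a
// round-up result and verify the enclosure.
__global__ void bracket_sqrt(float x)
{
    float lo = __fsqrt_rd(x);  // rounded toward negative infinity
    float hi = __fsqrt_ru(x);  // rounded toward positive infinity

    // Squaring a float in double precision is exact, so these comparisons
    // test the enclosure without additional rounding error.
    double lo2 = (double)lo * (double)lo;
    double hi2 = (double)hi * (double)hi;
    printf("sqrt(%g) lies in [%.9g, %.9g]\n", x, lo, hi);
    printf("lo^2 <= x: %s, x <= hi^2: %s\n",
           lo2 <= (double)x ? "yes" : "no",
           (double)x <= hi2 ? "yes" : "no");
}

int main()
{
    bracket_sqrt<<<1, 1>>>(2.0f);
    cudaError_t err = cudaGetLastError();
    if (err == cudaSuccess) err = cudaDeviceSynchronize();
    if (err != cudaSuccess) {
        fprintf(stderr, "CUDA error: %s\n", cudaGetErrorString(err));
        return 1;
    }
    return 0;
}
```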

__device__ float __fsub_rd ( float x, float y )

Subtract two floating point values in round-down mode.

Returns

Returns x - y.

Description

Compute the difference of x and y in round-down (to negative infinity) mode.

Note:

For accuracy information for this function see the CUDA C Programming Guide, Appendix D.1, Table 6.
This operation will never be merged into a single multiply-add instruction.

__device__ float __fsub_rn ( float x, float y )

Subtract two floating point values in round-to-nearest-even mode.

Returns

Returns x - y.

Description

Compute the difference of x and y in round-to-nearest-even mode.

Note:

For accuracy information for this function see the CUDA C Programming Guide, Appendix D.1, Table 6.
This operation will never be merged into a single multiply-add instruction.

__device__ float __fsub_ru ( float x, float y )

Subtract two floating point values in round-up mode.

Returns

Returns x - y.

Description

Compute the difference of x and y in round-up (to positive infinity) mode.

Note:

For accuracy information for this function see the CUDA C Programming Guide, Appendix D.1, Table 6.
This operation will never be merged into a single multiply-add instruction.

__device__ float __fsub_rz ( float x, float y )

Subtract two floating point values in round-towards-zero mode.

Returns

Returns x - y.

Description

Compute the difference of x and y in round-towards-zero mode.

Note:

For accuracy information for this function see the CUDA C Programming Guide, Appendix D.1, Table 6.
This operation will never be merged into a single multiply-add instruction.

__device__ float __log10f ( float x )

Calculate the fast approximate base 10 logarithm of the input argument.

Returns

Returns an approximation to $\log_{10} (x)$ .

Description

Calculate the fast approximate base 10 logarithm of the input argument x.

Note:

For accuracy information for this function see the CUDA C Programming Guide, Appendix D.2, Table 9.
Most input and output values around the denormal range are flushed to sign-preserving 0.0.

__device__ float __log2f ( float x )

Calculate the fast approximate base 2 logarithm of the input argument.

Returns

Returns an approximation to $\log_{2} (x)$ .

Description

Calculate the fast approximate base 2 logarithm of the input argument x.

Note:

For accuracy information for this function see the CUDA C Programming Guide, Appendix D.2, Table 9.
Input and output in the denormal range are flushed to sign-preserving 0.0.

__device__ float __logf ( float x )

Calculate the fast approximate base $e$ logarithm of the input argument.

Returns

Returns an approximation to $\log_{e} (x)$ .

Description

Calculate the fast approximate base $e$ logarithm of the input argument x.

Note:

For accuracy information for this function see the CUDA C Programming Guide, Appendix D.2, Table 9.
Most input and output values around the denormal range are flushed to sign-preserving 0.0.

__device__ float __powf ( float x, float y )

Calculate the fast approximation of $x^{y}$.

Returns

Returns an approximation to $x^{y}$ .

Description

Calculate the fast approximation of $x^{y}$, that is, the first input argument x raised to the power of the second input argument y.

Note:

For accuracy information for this function see the CUDA C Programming Guide, Appendix D.2, Table 9.
Most input and output values around the denormal range are flushed to sign-preserving 0.0.
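
The accuracy table in the CUDA C Programming Guide describes the error of __powf as derived from an implementation as exp2f(y * __log2f(x)). That decomposition is treated as an assumption here, not as a statement of what any particular compiler emits. The sketch below prints __powf next to that composed expression and next to the standard powf for a positive base. The kernel name compare_pow and the operands are hypothetical.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical demo kernel: compare __powf with an assumed decomposition
// and with the standard single precision powf.
__global__ void compare_pow(float x, float y)
{
    float fast     = __powf(x, y);           // fast approximate intrinsic
    float composed = exp2f(y * __log2f(x));  // assumed decomposition
    float ref      = powf(x, y);             // standard math function
    printf("__powf = %.9g  exp2f(y * __log2f(x)) = %.9g  powf = %.9g\n",
           fast, composed, ref);
}

int main()
{
    compare_pow<<<1, 1>>>(2.5f, 3.2f);  // arbitrary positive base and exponent
    cudaError_t err = cudaGetLastError();
    if (err == cudaSuccess) err = cudaDeviceSynchronize();
    if (err != cudaSuccess) {
        fprintf(stderr, "CUDA error: %s\n", cudaGetErrorString(err));
        return 1;
    }
    return 0;
}
```

The values printed for the composed expression and for __powf need not be bit-identical; the comparison is only meant to show the order of magnitude of the difference from powf.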

__device__ float __saturatef ( float x )

Clamp the input argument to [+0.0, 1.0].

Returns

__saturatef(x) returns 0 if x < 0.
__saturatef(x) returns 1 if x > 1.
__saturatef(x) returns x if $0 \leq x \leq 1$ .
__saturatef(NaN) returns 0.

Description

Clamp the input argument x to be within the interval [+0.0, 1.0].
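
The cases listed under Returns can be exercised directly. The sketch below, with a hypothetical kernel name show_saturate, passes one value from each case, including NaN, and prints the clamped result.

```cuda
#include <cmath>
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical demo kernel: print the input next to its clamped value.
__global__ void show_saturate(float x)
{
    printf("__saturatef(%g) = %g\n", x, __saturatef(x));
}

int main()
{
    // One sample per documented case:
    //   x < 0 (returns 0), 0 <= x <= 1 (returns x),
    //   x > 1 (returns 1), NaN (returns 0).
    const float inputs[] = {-0.5f, 0.25f, 1.5f, NAN};
    for (float x : inputs) {
        show_saturate<<<1, 1>>>(x);
    }
    cudaError_t err = cudaGetLastError();
    if (err == cudaSuccess) err = cudaDeviceSynchronize();
    if (err != cudaSuccess) {
        fprintf(stderr, "CUDA error: %s\n", cudaGetErrorString(err));
        return 1;
    }
    return 0;
}
```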

__device__ void __sincosf ( float x, float* sptr, float* cptr )

Calculate the fast approximate sine and cosine of the first input argument.

Returns

none

Description

Calculate the fast approximate sine and cosine of the first input argument x (measured in radians). The sine is written to the location pointed to by the second argument, sptr, and the cosine to the location pointed to by the third argument, cptr.

Note:

For accuracy information for this function see the CUDA C Programming Guide, Appendix D.2, Table 9.
Denormal input and output are flushed to sign-preserving 0.0.
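
Because the results come back through pointers, a short usage sketch may be helpful. The example below rotates a 2D point by an angle in radians, obtaining both trigonometric values with one call. The kernel name rotate_point and the sample point are hypothetical.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical demo kernel: rotate the point (px, py) counterclockwise by
// angle radians using the fast approximate sine and cosine.
__global__ void rotate_point(float px, float py, float angle)
{
    float s, c;
    __sincosf(angle, &s, &c);  // s receives the sine, c receives the cosine

    float rx = c * px - s * py;
    float ry = s * px + c * py;
    printf("(%g, %g) rotated by %g rad -> (%.7g, %.7g)\n",
           px, py, angle, rx, ry);
}

int main()
{
    rotate_point<<<1, 1>>>(1.0f, 0.0f, 0.5f);  // arbitrary point and angle
    cudaError_t err = cudaGetLastError();
    if (err == cudaSuccess) err = cudaDeviceSynchronize();
    if (err != cudaSuccess) {
        fprintf(stderr, "CUDA error: %s\n", cudaGetErrorString(err));
        return 1;
    }
    return 0;
}
```

For the point (1, 0) the rotated coordinates are simply the approximate cosine and sine of the angle, which makes the output easy to check by eye.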

__device__ float __sinf ( float x )

Calculate the fast approximate sine of the input argument.

Returns

Returns the approximate sine of x.

Description

Calculate the fast approximate sine of the input argument x, measured in radians.

Note:

For accuracy information for this function see the CUDA C Programming Guide, Appendix D.2, Table 9.
Input and output in the denormal range are flushed to sign-preserving 0.0.

__device__ float __tanf ( float x )

Calculate the fast approximate tangent of the input argument.

Returns

Returns the approximate tangent of x.

Description

Calculate the fast approximate tangent of the input argument x, measured in radians.

Note:

For accuracy information for this function see the CUDA C Programming Guide, Appendix D.2, Table 9.
The result is computed as the fast divide of __sinf() by __cosf(). Denormal input and output are flushed to sign-preserving 0.0 at each step of the computation.
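
The note above can be explored with a small experiment. The sketch below prints __tanf next to __fdividef(__sinf(x), __cosf(x)) written out by hand, and next to the standard tanf for reference. The kernel name compare_tan and the sample arguments are hypothetical, and no claim is made that the first two columns are bit-identical on every device or compiler version.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical demo kernel: compare __tanf with the composition described
// in the note and with the standard single precision tanf.
__global__ void compare_tan(float x)
{
    float fast     = __tanf(x);                         // fast intrinsic
    float composed = __fdividef(__sinf(x), __cosf(x));  // composition from the note
    float ref      = tanf(x);                           // standard math function
    printf("x = %g  __tanf = %.9g  composed = %.9g  tanf = %.9g\n",
           x, fast, composed, ref);
}

int main()
{
    const float inputs[] = {0.25f, 1.0f, 1.5f};  // arbitrary sample arguments
    for (float x : inputs) {
        compare_tan<<<1, 1>>>(x);
    }
    cudaError_t err = cudaGetLastError();
    if (err == cudaSuccess) err = cudaDeviceSynchronize();
    if (err != cudaSuccess) {
        fprintf(stderr, "CUDA error: %s\n", cudaGetErrorString(err));
        return 1;
    }
    return 0;
}
```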