Math Operations

The APIs in this section perform elementwise mathematical computations on tile-like arguments. Each function returns an approximation of the infinitely precise mathematical result.

The behavior of these APIs is not yet exhaustively specified and may not follow IEEE 754 semantics. Information about the error bounds, rounding modes, and the behavior of non-finite, signed zero, and subnormal arguments may be absent.

Warning

Some of the mathematical APIs may be modified to accept rounding mode and subnormals rounding mode template arguments in future releases. This change may break existing code that uses explicit template arguments in math API invocations.

To maintain compatibility, avoid specifying an explicit template argument when invoking the math functions:

ct::exp(0.0);         // Preferred syntax
ct::exp<double>(0.0); // Discouraged syntax
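To illustrate why explicit template arguments are fragile, the following is a minimal host-side sketch in standard C++. The `rounding` enumeration and the `v1`/`v2` namespaces are hypothetical stand-ins, not the cuda::tiles declarations; they only mimic a function template that later gains a leading mode parameter.

```cpp
#include <cmath>

// Hypothetical stand-in for a rounding mode enumeration (not the ct:: type).
enum class rounding { nearest, toward_zero };

namespace v1 {
// Current shape: the only template parameter is the deduced operand type.
template <class T>
T exp(T x) { return std::exp(x); }
}  // namespace v1

namespace v2 {
// Possible future shape: a mode parameter is added in front of T.
template <rounding Mode = rounding::nearest, class T>
T exp(T x) { return std::exp(x); }
}  // namespace v2

// Calls that rely on deduction compile against both shapes.
inline double call_v1(double x) { return v1::exp(x); }
inline double call_v2(double x) { return v2::exp(x); }

// v1::exp<double>(x) compiles, but v2::exp<double>(x) would not:
// the explicit argument `double` would be matched against Mode.
```

Under these assumptions, only the deduced call form survives the signature change unchanged.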

cuda::tiles::ceil

template<ct::basic_floating_point_tile Tile>
__tile__ Tile ceil(Tile in) noexcept;

Performs elementwise ceiling on the operand in. For each element \(x\) in operand in, the result of the computation is \(\lceil x \rceil\).

cuda::tiles::floor

template<ct::basic_floating_point_tile Tile>
__tile__ Tile floor(Tile in) noexcept;

Performs elementwise floor on the operand in. For each element \(x\) in operand in, the result of the computation is \(\lfloor x \rfloor\).
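As a rough host-side reference for the elementwise semantics of ceil and floor, the sketch below applies the standard library functions to each element of a `std::array`. The array is only a stand-in for a tile, and `ref_ceil` and `ref_floor` are hypothetical helper names that are not part of cuda::tiles.

```cpp
#include <array>
#include <cmath>
#include <cstddef>

// Host-side model: apply std::ceil to every element independently.
template <class T, std::size_t N>
std::array<T, N> ref_ceil(std::array<T, N> in) {
    for (T& x : in) x = std::ceil(x);
    return in;
}

// Host-side model: apply std::floor to every element independently.
template <class T, std::size_t N>
std::array<T, N> ref_floor(std::array<T, N> in) {
    for (T& x : in) x = std::floor(x);
    return in;
}
```

For the input {-1.5, 0.25, 2.0}, this model gives {-1, 1, 2} for ceil and {-2, 0, 2} for floor; the result has the same shape and element type as the operand.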

cuda::tiles::pow

template<ct::basic_floating_point_tile B, ct::basic_floating_point_tile E>
requires ct::arithmetic_tile_convertible<B, E>
__tile__ ct::arithmetic_tile_conversion_t<B, E> pow(B base, E exponent) noexcept;

Performs elementwise exponentiation on the arithmetic tile converted operands base and exponent.

Let \(a\) and \(b\) be corresponding elements of the converted operands base and exponent respectively.

The result of each computation is \(a^b\).
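The conversion step can be pictured with a host-side sketch. It assumes, purely for illustration, that arithmetic tile conversion behaves like `std::common_type_t` on the element types; the actual conversion rules are defined elsewhere and may differ. `ref_pow` is a hypothetical helper, and `std::array` stands in for a tile.

```cpp
#include <array>
#include <cmath>
#include <cstddef>
#include <type_traits>

// Host-side model: convert both operands to a common element type R,
// then compute base[i] raised to exponent[i] for every index i.
template <class B, class E, std::size_t N>
std::array<typename std::common_type<B, E>::type, N>
ref_pow(const std::array<B, N>& base, const std::array<E, N>& exponent) {
    using R = typename std::common_type<B, E>::type;  // assumed conversion rule
    std::array<R, N> out{};
    for (std::size_t i = 0; i < N; ++i) {
        out[i] = std::pow(static_cast<R>(base[i]), static_cast<R>(exponent[i]));
    }
    return out;
}
```

With a float base and a double exponent, this model produces double results, mirroring how the return type of pow is the converted tile type rather than the type of either operand.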

cuda::tiles::exp2

template<
ct::subnormals_rounding_mode SubMode = ct::default_subnormals_rounding_mode(),
ct::basic_floating_point_tile Tile
>
requires /* atomic constraint */
__tile__ Tile exp2(Tile in, ct::subnormals_rounding_mode_constant<SubMode> = {}) noexcept;

Performs elementwise base two exponentiation of operand in.

For each element \(x\) in operand in, the result of the computation is

\[\operatorname{subround}(2^{\operatorname{subround}(x)})\]

where \(\operatorname{subround}\) applies a subnormals rounding mode as determined by the SubMode template argument.

The atomic constraint validates that:

  1. If SubMode is round subnormals to zero, then the element type \(T\) of Tile is float.

  2. The value SubMode is an enumerator of ct::subnormals_rounding_mode.
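The double application of \(\operatorname{subround}\) can be modeled on the host as follows. This sketch assumes that "round subnormals to zero" flushes a subnormal value to a zero of the same sign; the sign handling is an assumption, since this section does not specify it. `flush_subnormal` and `ref_exp2_ftz` are hypothetical names.

```cpp
#include <cmath>
#include <limits>

// Assumed meaning of "round subnormals to zero":
// replace a subnormal value by a zero with the same sign.
inline float flush_subnormal(float v) {
    return std::fpclassify(v) == FP_SUBNORMAL ? std::copysign(0.0f, v) : v;
}

// Model of subround(2^subround(x)) with flushing enabled:
// the input is flushed first, then the result is flushed again.
inline float ref_exp2_ftz(float x) {
    return flush_subnormal(std::exp2(flush_subnormal(x)));
}
```

In this model a subnormal input is treated as zero, so the result is 1, and an input such as -130, whose exact result 2^-130 is subnormal in float, yields zero.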

cuda::tiles::exp

template<ct::basic_floating_point_tile Tile>
__tile__ Tile exp(Tile in) noexcept;

Performs elementwise base \(e\) exponentiation on operand in. For each element \(x\) of in, the result of the computation is \(e^x\).

cuda::tiles::log2

template<ct::basic_floating_point_tile Tile>
__tile__ Tile log2(Tile in) noexcept;

Performs elementwise base \(2\) logarithm on operand in. For each element \(x\) of in, the result of the computation is \(\operatorname{log}_2(x)\).

cuda::tiles::log

template<ct::basic_floating_point_tile Tile>
__tile__ Tile log(Tile in) noexcept;

Performs elementwise natural logarithm on operand in. For each element \(x\) of in, the result of the computation is \(\operatorname{ln}(x)\).

cuda::tiles::sqrt

template<
ct::rounding_mode Mode = ct::default_rounding_mode(),
ct::subnormals_rounding_mode SubMode = ct::default_subnormals_rounding_mode(),
ct::basic_floating_point_tile Tile
>
requires /* atomic constraint */
__tile__ Tile sqrt(Tile x, ct::rounding_mode_constant<Mode> = {}, ct::subnormals_rounding_mode_constant<SubMode> = {}) noexcept;

Performs elementwise square root on operand x.

For each element \(x\) of the operand, the result of the computation is

\[\operatorname{subround}\left(\sqrt{\operatorname{subround}(x)}\right)\]

where \(\operatorname{subround}\) applies a subnormals rounding mode as specified by SubMode.

The atomic constraint validates that:

  1. Mode is a precise rounding mode or round approximate.

  2. If Mode is round approximate, then the element type \(T\) of Tile is float.

  3. If SubMode is round subnormals to zero, then the element type \(T\) of Tile is float.

  4. The values of Mode and SubMode are enumerators of their respective types.
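The first three rules above can be sketched as a compile-time predicate. The enumerations and enumerator names below are hypothetical, because this section does not list the enumerators of ct::rounding_mode or ct::subnormals_rounding_mode; `other` stands for any mode that rule 1 rejects. Rule 4 is not modeled.

```cpp
#include <type_traits>

// Hypothetical enumerations (the real ct:: enumerators may differ).
enum class rounding_mode { nearest_even, toward_zero, approximate, other };
enum class subnormals_mode { preserve, to_zero };

// Returns true when the mode combination passes rules 1 to 3 for element type T.
template <rounding_mode Mode, subnormals_mode Sub, class T>
constexpr bool sqrt_modes_ok() {
    return Mode != rounding_mode::other                                            // rule 1
        && (Mode != rounding_mode::approximate || std::is_same<T, float>::value)   // rule 2
        && (Sub != subnormals_mode::to_zero || std::is_same<T, float>::value);     // rule 3
}
```

Under these assumptions, an approximate square root is accepted for float but rejected for double, and the same holds for flushing subnormals to zero.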

cuda::tiles::rsqrt

template<
ct::subnormals_rounding_mode SubMode = ct::default_subnormals_rounding_mode(),
ct::basic_floating_point_tile Tile
>
requires /* atomic constraint */
__tile__ Tile rsqrt(Tile in, ct::subnormals_rounding_mode_constant<SubMode> = {}) noexcept;

Performs elementwise reciprocal square root on operand in. For each element \(x\) of in, the result of the computation is

\[\operatorname{subround}\left(\frac{1}{\sqrt{\operatorname{subround}(x)}}\right)\]

where \(\operatorname{subround}\) applies a subnormals rounding mode as specified by SubMode.

The atomic constraint validates that:

  1. If SubMode is round subnormals to zero, then the element type \(T\) of Tile is float.

  2. The value SubMode is an enumerator of ct::subnormals_rounding_mode.

cuda::tiles::cosh

template<ct::basic_floating_point_tile Tile>
__tile__ Tile cosh(Tile in) noexcept;

Performs elementwise hyperbolic cosine on the input in. For each element \(x\) of in, the result of the computation is \(\operatorname{cosh}(x)\).

cuda::tiles::cos

template<ct::basic_floating_point_tile Tile>
__tile__ Tile cos(Tile in) noexcept;

Performs elementwise cosine on the input in. For each element \(x\) of in, the result of the computation is \(\operatorname{cos}(x)\).

cuda::tiles::sinh

template<ct::basic_floating_point_tile Tile>
__tile__ Tile sinh(Tile in) noexcept;

Performs elementwise hyperbolic sine on the input in. For each element \(x\) of in, the result of the computation is \(\operatorname{sinh}(x)\).

cuda::tiles::sin

template<ct::basic_floating_point_tile Tile>
__tile__ Tile sin(Tile in) noexcept;

Performs elementwise sine on the input in. For each element \(x\) of in, the result of the computation is \(\operatorname{sin}(x)\).

cuda::tiles::tanh

template<ct::basic_floating_point_tile Tile>
__tile__ Tile tanh(Tile in) noexcept;

Performs elementwise hyperbolic tangent on the input in. For each element \(x\) of in, the result of the computation is \(\operatorname{tanh}(x)\).

cuda::tiles::tan

template<ct::basic_floating_point_tile Tile>
__tile__ Tile tan(Tile in) noexcept;

Performs elementwise tangent on the input in. For each element \(x\) of in, the result of the computation is \(\operatorname{tan}(x)\).

cuda::tiles::atan2

template<ct::arithmetic_tile Lhs, ct::arithmetic_tile Rhs>
requires ct::arithmetic_tile_convertible<Lhs, Rhs> && ct::basic_floating_point_tile<ct::arithmetic_tile_conversion_t<Lhs, Rhs>>
__tile__ ct::arithmetic_tile_conversion_t<Lhs, Rhs> atan2(Lhs y, Rhs x) noexcept;

Performs the elementwise arctangent of the ratio y / x of the arithmetic tile converted operands y and x.

Let \(a\) and \(b\) be corresponding elements of the converted operands y and x respectively. The result of each computation is the angle between the positive x axis and the ray from the origin through the point \((b, a)\), that is, the point whose x coordinate is \(b\) and whose y coordinate is \(a\).
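The argument order and the quadrant handling are easy to get wrong, so here is a small host-side sketch using `std::atan2` as a reference model. `angle_of` and `ratio_only` are hypothetical helper names; the sketch assumes the tile function follows the usual two-argument arctangent convention, which this section does not state in full.

```cpp
#include <cmath>

// Two-argument arctangent: y comes first, x second.
// The signs of both arguments select the quadrant of the result.
inline double angle_of(double y, double x) { return std::atan2(y, x); }

// Single-argument arctangent of the ratio, shown for contrast:
// the quadrant information is lost once y / x has been formed.
inline double ratio_only(double y, double x) { return std::atan(y / x); }
```

For y = -1 and x = -1, the point lies in the third quadrant and `angle_of` returns an angle close to \(-3\pi/4\), whereas `ratio_only` returns an angle close to \(\pi/4\) because the ratio is 1.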