Math Operations
The APIs in this section provide elementwise mathematical computations on tile-like arguments. Each function provides an approximation of the infinitely precise mathematical result.
The behavior of these APIs is not yet exhaustively specified and may not follow IEEE 754 semantics. Information about the error bounds, rounding modes, and the behavior of non-finite, signed zero, and subnormal arguments may be absent.
Warning
Some of the mathematical APIs may be modified to accept rounding mode and subnormals rounding mode template arguments in future releases. This change may break existing code that uses explicit template arguments in math API invocations.
To maintain compatibility, avoid specifying an explicit template argument when invoking the math functions:
ct::exp(0.0); // Preferred syntax
ct::exp<double>(0.0); // Discouraged syntax
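The compatibility concern above can be illustrated with a small host-side C++ sketch. The namespaces `v1` and `v2` and the `rounding_mode` enumeration are hypothetical stand-ins, not the actual `ct` declarations; they only model what happens when a defaulted non-type template parameter is placed in front of the deduced operand type.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical stand-in for a rounding-mode enumeration (not the real ct type).
enum class rounding_mode { nearest_even, approx };

namespace v1 {
// "Current" shape: the only template parameter is the deduced operand type.
template <class T>
T exp(T in) noexcept { return std::exp(in); }
}  // namespace v1

namespace v2 {
// "Future" shape: a defaulted non-type parameter now comes first.
template <rounding_mode Mode = rounding_mode::nearest_even, class T>
T exp(T in) noexcept { return std::exp(in); }
}  // namespace v2

inline void demo_explicit_arguments() {
    assert(v1::exp(0.0) == 1.0);          // deduced: compiles today
    assert(v2::exp(0.0) == 1.0);          // deduced: still compiles after the change
    assert(v1::exp<double>(0.0) == 1.0);  // explicit type: compiles only today
    // v2::exp<double>(0.0);              // would not compile: `double` is not a rounding_mode value
}
```

Calls that rely on deduction compile against both shapes, which is why the deduced form is the preferred syntax.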
cuda::tiles::ceil
-
template<ct::basic_floating_point_tile Tile>
__tile__ Tile ceil(Tile in) noexcept;
-
Performs elementwise ceiling on the operand `in`. For each element \(x\) in operand `in`, the result of the computation is \(\lceil x \rceil\).
cuda::tiles::floor
-
template<ct::basic_floating_point_tile Tile>
__tile__ Tile floor(Tile in) noexcept;
-
Performs elementwise floor on the operand `in`. For each element \(x\) in operand `in`, the result of the computation is \(\lfloor x \rfloor\).
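As a rough host-side illustration of the elementwise semantics of `floor` and `ceil`, the following sketch uses `std::array` as a stand-in for a tile. The helper names `map_elements`, `floor_ref`, and `ceil_ref` are hypothetical and are not part of the `ct` API.

```cpp
#include <array>
#include <cassert>
#include <cmath>
#include <cstddef>

// Applies `op` to every element of a std::array, which stands in for a tile
// here (an assumption made purely for illustration).
template <class T, std::size_t N, class Op>
std::array<T, N> map_elements(const std::array<T, N>& in, Op op) {
    std::array<T, N> out{};
    for (std::size_t i = 0; i < N; ++i) out[i] = op(in[i]);
    return out;
}

// Scalar reference for elementwise floor: largest integer not greater than x.
template <class T, std::size_t N>
std::array<T, N> floor_ref(const std::array<T, N>& in) {
    return map_elements(in, [](T x) { return std::floor(x); });
}

// Scalar reference for elementwise ceiling: smallest integer not less than x.
template <class T, std::size_t N>
std::array<T, N> ceil_ref(const std::array<T, N>& in) {
    return map_elements(in, [](T x) { return std::ceil(x); });
}

inline void demo_floor_ceil() {
    const std::array<float, 3> t{-1.5f, 0.25f, 2.0f};
    const auto f = floor_ref(t);
    const auto c = ceil_ref(t);
    assert(f[0] == -2.0f && f[1] == 0.0f && f[2] == 2.0f);
    assert(c[0] == -1.0f && c[1] == 1.0f && c[2] == 2.0f);
}
```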
cuda::tiles::pow
-
template<ct::basic_floating_point_tile B, ct::basic_floating_point_tile E>
requires ct::arithmetic_tile_convertible<B, E>
__tile__ ct::arithmetic_tile_conversion_t<B, E> pow(B base, E exponent) noexcept;
-
Performs elementwise exponentiation on the arithmetic tile converted operands `base` and `exponent`. Let \(a\) and \(b\) be corresponding elements of the converted operands `base` and `exponent` respectively. The result of each computation is \(a^b\).
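The convert-then-compute order described above can be sketched on the host as follows. This is only an illustration: `std::array` stands in for a tile, and `std::common_type_t` is an assumed stand-in for `ct::arithmetic_tile_conversion_t`, whose actual conversion rules are not given in this section. The name `pow_ref` is hypothetical.

```cpp
#include <array>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <type_traits>

// Assumption: std::common_type_t models the arithmetic tile conversion.
template <class B, class E, std::size_t N>
std::array<std::common_type_t<B, E>, N>
pow_ref(const std::array<B, N>& base, const std::array<E, N>& exponent) {
    using R = std::common_type_t<B, E>;
    std::array<R, N> out{};
    for (std::size_t i = 0; i < N; ++i) {
        const R a = static_cast<R>(base[i]);      // converted base element
        const R b = static_cast<R>(exponent[i]);  // converted exponent element
        out[i] = std::pow(a, b);                  // a^b in the converted type
    }
    return out;
}

inline void demo_pow() {
    const std::array<float, 2> base{2.0f, 9.0f};
    const std::array<double, 2> exponent{3.0, 0.5};
    const auto r = pow_ref(base, exponent);  // elements are double
    assert(std::fabs(r[0] - 8.0) < 1e-12);
    assert(std::fabs(r[1] - 3.0) < 1e-12);
}
```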
cuda::tiles::exp2
-
template<
ct::subnormals_rounding_mode SubMode = ct::default_subnormals_rounding_mode(),
ct::basic_floating_point_tile Tile
>
requires /* atomic constraint */
__tile__ Tile exp2(Tile in, ct::subnormals_rounding_mode_constant<SubMode> = {}) noexcept;
-
Performs elementwise base two exponentiation of operand `in`. For each element \(x\) in operand `in`, the result of the computation is
\[\operatorname{subround}(2^{\operatorname{subround}(x)})\]
where \(\operatorname{subround}\) applies a subnormals rounding mode as determined by the `SubMode` template argument.
The atomic constraint validates that, with \(T\) denoting the element type of `Tile`:
If `SubMode` is round subnormals to zero, then \(T\) is `float`.
The value `SubMode` is an enumerator of `ct::subnormals_rounding_mode`.
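To make the role of \(\operatorname{subround}\) concrete, here is a scalar host-side sketch for a single `float` element. It assumes that "round subnormals to zero" replaces a subnormal value with a zero of the same sign; this section does not spell out that behavior, so treat it as an assumption. The names `subround_to_zero` and `exp2_ref` are hypothetical.

```cpp
#include <cassert>
#include <cmath>
#include <limits>

// Assumed model of "round subnormals to zero": subnormal values become a zero
// of the same sign; every other value passes through unchanged.
inline float subround_to_zero(float v) {
    return std::fpclassify(v) == FP_SUBNORMAL ? std::copysign(0.0f, v) : v;
}

// Scalar reference for one element: subround(2^subround(x)).
inline float exp2_ref(float x) {
    return subround_to_zero(std::exp2(subround_to_zero(x)));
}

inline void demo_exp2() {
    // A subnormal input is flushed to zero first, so the result is 2^0.
    assert(exp2_ref(std::numeric_limits<float>::denorm_min()) == 1.0f);
    // 2^-140 is subnormal in float, so the output is flushed to zero.
    assert(exp2_ref(-140.0f) == 0.0f);
}
```

The two assertions exercise the inner and the outer application of \(\operatorname{subround}\) respectively.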
cuda::tiles::exp
-
template<ct::basic_floating_point_tile Tile>
__tile__ Tile exp(Tile in) noexcept;
-
Performs elementwise base \(e\) exponentiation on operand `in`. For each element \(x\) of `in`, the result of the computation is \(e^x\).
cuda::tiles::log2
-
template<ct::basic_floating_point_tile Tile>
__tile__ Tile log2(Tile in) noexcept;
-
Performs elementwise base \(2\) logarithm on operand `in`. For each element \(x\) of `in`, the result of the computation is \(\log_2(x)\).
cuda::tiles::log
-
template<ct::basic_floating_point_tile Tile>
__tile__ Tile log(Tile in) noexcept;
-
Performs elementwise natural logarithm on operand `in`. For each element \(x\) of `in`, the result of the computation is \(\ln(x)\).
cuda::tiles::sqrt
-
template<
ct::rounding_mode Mode = ct::default_rounding_mode(),
ct::subnormals_rounding_mode SubMode = ct::default_subnormals_rounding_mode(),
ct::basic_floating_point_tile Tile
>
requires /* atomic constraint */
__tile__ Tile sqrt(Tile in, ct::rounding_mode_constant<Mode> = {}, ct::subnormals_rounding_mode_constant<SubMode> = {}) noexcept;
-
Performs elementwise square root on operand `in`. For each element \(x\) of `in`, the result of the computation is
\[\operatorname{subround}\left(\sqrt{\operatorname{subround}(x)}\right)\]
where \(\operatorname{subround}\) applies a subnormals rounding mode as specified by `SubMode`.
The atomic constraint validates that, with \(T\) denoting the element type of `Tile`:
`Mode` is a precise rounding mode or round approximate.
If `Mode` is round approximate, then \(T\) is `float`.
If `SubMode` is round subnormals to zero, then \(T\) is `float`.
The values `Mode` and `SubMode` are enumerators of their respective types.
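The constraint rules above can be read as a compile-time predicate over the element type and the two modes. The sketch below is a hypothetical model only: the enumerations, their enumerator names, and the choice of which modes count as "precise" are assumptions made for illustration and do not reflect the actual `ct` declarations. The final rule (that the values are enumerators of their types) is not modeled.

```cpp
#include <type_traits>

// Hypothetical enumerations; the real ct enumerator names are not listed in this section.
enum class rounding_mode { nearest_even, toward_zero, approx, other };
enum class subnormals_mode { preserve, to_zero };

// Assumption: these two modes stand in for the "precise" rounding modes.
constexpr bool is_precise(rounding_mode m) {
    return m == rounding_mode::nearest_even || m == rounding_mode::toward_zero;
}

// True when the first three listed rules hold for element type T.
template <class T>
constexpr bool sqrt_constraint_ok(rounding_mode mode, subnormals_mode sub) {
    return (is_precise(mode) || mode == rounding_mode::approx)               // rule 1
        && (mode != rounding_mode::approx || std::is_same<T, float>::value)  // rule 2
        && (sub != subnormals_mode::to_zero || std::is_same<T, float>::value);  // rule 3
}

static_assert(sqrt_constraint_ok<float>(rounding_mode::approx, subnormals_mode::to_zero), "");
static_assert(sqrt_constraint_ok<double>(rounding_mode::nearest_even, subnormals_mode::preserve), "");
static_assert(!sqrt_constraint_ok<double>(rounding_mode::approx, subnormals_mode::preserve), "");
static_assert(!sqrt_constraint_ok<double>(rounding_mode::nearest_even, subnormals_mode::to_zero), "");
```

Under this reading, `double` tiles are limited to precise rounding modes with subnormals preserved, while `float` tiles may also use the approximate and flush-to-zero variants.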
cuda::tiles::rsqrt
-
template<
ct::subnormals_rounding_mode SubMode = ct::default_subnormals_rounding_mode(),
ct::basic_floating_point_tile Tile
>
requires /* atomic constraint */
__tile__ Tile rsqrt(Tile in, ct::subnormals_rounding_mode_constant<SubMode> = {}) noexcept;
-
Performs elementwise reciprocal square root on operand `in`. For each element \(x\) of `in`, the result of the computation is
\[\operatorname{subround}\left(\frac{1}{\sqrt{\operatorname{subround}(x)}}\right)\]
where \(\operatorname{subround}\) applies a subnormals rounding mode as specified by `SubMode`.
The atomic constraint validates that, with \(T\) denoting the element type of `Tile`:
If `SubMode` is round subnormals to zero, then \(T\) is `float`.
The value `SubMode` is an enumerator of `ct::subnormals_rounding_mode`.
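One consequence of applying \(\operatorname{subround}\) to the input is worth seeing on a concrete value. The scalar host-side sketch below assumes that "round subnormals to zero" maps a subnormal to a zero of the same sign and that the host follows IEEE 754 division semantics; the names `subround_to_zero` and `rsqrt_ref` are hypothetical.

```cpp
#include <cassert>
#include <cmath>
#include <limits>

// Assumed model of "round subnormals to zero": subnormal values become a zero
// of the same sign; every other value passes through unchanged.
inline float subround_to_zero(float v) {
    return std::fpclassify(v) == FP_SUBNORMAL ? std::copysign(0.0f, v) : v;
}

// Scalar reference for one element: subround(1 / sqrt(subround(x))).
inline float rsqrt_ref(float x) {
    return subround_to_zero(1.0f / std::sqrt(subround_to_zero(x)));
}

inline void demo_rsqrt() {
    assert(rsqrt_ref(4.0f) == 0.5f);
    // A positive subnormal input is flushed to +0 first, so the model yields
    // +infinity rather than a large finite value.
    assert(std::isinf(rsqrt_ref(std::numeric_limits<float>::denorm_min())));
}
```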
cuda::tiles::cosh
-
template<ct::basic_floating_point_tile Tile>
__tile__ Tile cosh(Tile in) noexcept;
-
Performs elementwise hyperbolic cosine on the input `in`. For each element \(x\) of `in`, the result of the computation is \(\cosh(x)\).
cuda::tiles::cos
-
template<ct::basic_floating_point_tile Tile>
__tile__ Tile cos(Tile in) noexcept;
-
Performs elementwise cosine on the input `in`. For each element \(x\) of `in`, the result of the computation is \(\cos(x)\).
cuda::tiles::sinh
-
template<ct::basic_floating_point_tile Tile>
__tile__ Tile sinh(Tile in) noexcept;
-
Performs elementwise hyperbolic sine on the input `in`. For each element \(x\) of `in`, the result of the computation is \(\sinh(x)\).
cuda::tiles::sin
-
template<ct::basic_floating_point_tile Tile>
__tile__ Tile sin(Tile in) noexcept;
-
Performs elementwise sine on the input `in`. For each element \(x\) of `in`, the result of the computation is \(\sin(x)\).
cuda::tiles::tanh
-
template<ct::basic_floating_point_tile Tile>
__tile__ Tile tanh(Tile in) noexcept;
-
Performs elementwise hyperbolic tangent on the input `in`. For each element \(x\) of `in`, the result of the computation is \(\tanh(x)\).
cuda::tiles::tan
-
template<ct::basic_floating_point_tile Tile>
__tile__ Tile tan(Tile in) noexcept;
-
Performs elementwise tangent on the input `in`. For each element \(x\) of `in`, the result of the computation is \(\tan(x)\).
cuda::tiles::atan2
-
template<ct::arithmetic_tile Lhs, ct::arithmetic_tile Rhs>
requires ct::arithmetic_tile_convertible<Lhs, Rhs> && ct::basic_floating_point_tile<ct::arithmetic_tile_conversion_t<Lhs, Rhs>>
__tile__ ct::arithmetic_tile_conversion_t<Lhs, Rhs> atan2(Lhs y, Rhs x) noexcept;
-
Performs elementwise arctangent of the ratio of the arithmetic tile converted operands `y` and `x`. Let \(a\) and \(b\) be corresponding elements of the converted operands `y` and `x` respectively. The result of each computation is the angle between the positive x axis and the ray connecting the origin and the point \((b, a)\), that is, the point whose x coordinate is \(b\) and whose y coordinate is \(a\).
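Because the vertical coordinate is passed first, argument order is easy to get wrong. The host-side sketch below assumes the same convention as `std::atan2(y, x)` and uses `std::common_type_t` as an assumed stand-in for `ct::arithmetic_tile_conversion_t`; `std::array` stands in for a tile and `atan2_ref` is a hypothetical name.

```cpp
#include <array>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <type_traits>

// Assumption: std::common_type_t models the arithmetic tile conversion.
template <class Y, class X, std::size_t N>
std::array<std::common_type_t<Y, X>, N>
atan2_ref(const std::array<Y, N>& y, const std::array<X, N>& x) {
    using R = std::common_type_t<Y, X>;
    static_assert(std::is_floating_point<R>::value,
                  "the converted element type must be floating point");
    std::array<R, N> out{};
    for (std::size_t i = 0; i < N; ++i) {
        // First argument is the y coordinate, second is the x coordinate.
        out[i] = std::atan2(static_cast<R>(y[i]), static_cast<R>(x[i]));
    }
    return out;
}

inline void demo_atan2() {
    const std::array<int, 2> y{1, 0};            // integer operand is allowed
    const std::array<float, 2> x{0.0f, -1.0f};   // converted type is float
    const auto r = atan2_ref(y, x);
    assert(std::fabs(r[0] - 1.5707963f) < 1e-6f);  // point (0, 1): a quarter turn
    assert(std::fabs(r[1] - 3.1415927f) < 1e-6f);  // point (-1, 0): a half turn
}
```

The integer `y` operand mirrors the signature's allowance for any arithmetic tile, provided the converted type is floating point.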