cuda.tile.add
- cuda.tile.add(x, y, /, *, rounding_mode=None, flush_to_zero=False)
Elementwise addition of two tiles.
The built-in expression x + y can be used instead.
- Parameters:
x (Tile) – LHS tile.
y (Tile) – RHS tile.
rounding_mode (RoundingMode) – The rounding mode for the operation. Only supported for float types; defaults to RoundingMode.RN when applicable.
flush_to_zero (const bool) – If True, flushes subnormal inputs and results to sign-preserving zero. Defaults to False.
The shapes of x and y are broadcast together, and their dtypes are promoted to a common dtype.
- Return type:
Examples
tx = ct.arange(4, dtype=ct.int32)
print(tx + 10)
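import cuda.tile as ct
import torch

@ct.kernel
def kernel():
    # Build the tile [0, 1, 2, 3] and add the scalar 10 to every element.
    tx = ct.arange(4, dtype=ct.int32)
    print(tx + 10)

torch.cuda.init()
# Launch the kernel on the current stream with a grid of one block and no arguments.
ct.launch(torch.cuda.current_stream(), (1,), kernel, ())
torch.cuda.synchronize()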
Output
[10, 11, 12, 13]
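The broadcasting of x and y mentioned under Parameters can be illustrated on the host. The sketch below is plain Python and does not call cuda.tile. It assumes that cuda.tile follows NumPy-style broadcasting (shapes aligned from the trailing dimension, with a size of 1 stretching to match), which this page does not spell out. The helper name broadcast_shape is hypothetical and is not part of the cuda.tile API.

```python
def broadcast_shape(shape_x, shape_y):
    """Return the broadcast of two shapes under NumPy-style rules.

    Hypothetical helper for illustration only; it assumes cuda.tile
    broadcasts the same way NumPy does.
    """
    result = []
    # Walk both shapes from the trailing dimension; a missing dimension counts as 1.
    for i in range(1, max(len(shape_x), len(shape_y)) + 1):
        dx = shape_x[-i] if i <= len(shape_x) else 1
        dy = shape_y[-i] if i <= len(shape_y) else 1
        # Two sizes are compatible if they are equal or one of them is 1.
        if dx != dy and dx != 1 and dy != 1:
            raise ValueError(f"shapes {shape_x} and {shape_y} do not broadcast")
        result.append(dy if dx == 1 else dx)
    return tuple(reversed(result))


# A length-4 tile plus a scalar (empty shape), as in tx + 10 above.
print(broadcast_shape((4,), ()))
# A column-shaped tile plus a row-shaped tile.
print(broadcast_shape((8, 1), (1, 16)))
```

Under that assumption, tx + 10 in the example keeps the shape (4,) of tx, and adding an (8, 1) tile to a (1, 16) tile would produce an (8, 16) result. Incompatible shapes such as (3,) and (4,) raise ValueError in this sketch; how cuda.tile reports the same condition is not covered here.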