cuda.tile.add

cuda.tile.add(x, y, /, *, rounding_mode=None, flush_to_zero=False)

Elementwise add on two tiles.

The built-in operator x + y can also be used.

Parameters:
  • x (Tile) – LHS tile.

  • y (Tile) – RHS tile.

  • rounding_mode (RoundingMode) – The rounding mode for the operation. Only supported for float types; the default is RoundingMode.RN when applicable.

  • flush_to_zero (const bool) – If True, flushes subnormal inputs and results to sign-preserving zero. The default is False.
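The effect of flush_to_zero can be modeled on the host in plain Python. The sketch below is only an illustration of "flushes subnormal inputs and results to sign-preserving zero": the helper names flush and add_ftz are hypothetical, it works on Python lists of 64-bit floats rather than tiles, and it is not how cuda.tile implements the flag.

```python
import math
import sys


def flush(v):
    """Replace a subnormal float with a zero of the same sign."""
    # sys.float_info.min is the smallest positive *normal* double,
    # so any nonzero magnitude below it is subnormal.
    if v != 0.0 and abs(v) < sys.float_info.min:
        return math.copysign(0.0, v)
    return v


def add_ftz(xs, ys):
    """Elementwise add with inputs and results flushed (host-side model only)."""
    return [flush(flush(a) + flush(b)) for a, b in zip(xs, ys)]


tiny = 5e-324  # smallest positive subnormal double
print(add_ftz([tiny, -tiny, 1.0], [0.0, -0.0, tiny]))  # [0.0, -0.0, 1.0]
```

The second element shows the sign-preserving part: a negative subnormal becomes -0.0, not 0.0.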

The shapes of x and y are broadcast to a common shape, and their dtypes are promoted to a common dtype.
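As a rough illustration of the broadcasting step, the following host-side sketch computes the common shape of two operands. It assumes NumPy-style broadcasting (shapes aligned from the trailing dimension, with size-1 dimensions stretched), which this page does not spell out. The helper broadcast_shape is hypothetical and not part of cuda.tile.

```python
from itertools import zip_longest


def broadcast_shape(shape_x, shape_y):
    """Return the common shape under NumPy-style broadcasting (assumed rule)."""
    result = []
    # Align shapes from the trailing dimension; missing leading
    # dimensions are treated as size 1.
    for dx, dy in zip_longest(reversed(shape_x), reversed(shape_y), fillvalue=1):
        if dx == dy or dy == 1:
            result.append(dx)
        elif dx == 1:
            result.append(dy)
        else:
            raise ValueError(f"shapes {shape_x} and {shape_y} do not broadcast")
    return tuple(reversed(result))


print(broadcast_shape((4, 1), (8,)))  # (4, 8)
print(broadcast_shape((4,), ()))      # (4,)
```

Under this assumption, the second call corresponds to adding a scalar to a 4-element tile, as in tx + 10.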

Return type:

Tile

Examples

import cuda.tile as ct
import torch

@ct.kernel
def kernel():
    tx = ct.arange(4, dtype=ct.int32)
    print(tx + 10)


torch.cuda.init()
ct.launch(torch.cuda.current_stream(), (1,), kernel, ())
torch.cuda.synchronize()

Output

[10, 11, 12, 13]