cuda.tile.maximum

cuda.tile.maximum(x, y, /, *, flush_to_zero=False)

Elementwise maximum on two tiles.

The built-in max(x, y) can be used as an equivalent.

Parameters:
  • x (Tile) – LHS tile.

  • y (Tile) – RHS tile.

  • flush_to_zero (const bool) – If True, flushes subnormal inputs and results to sign-preserving zero. Default is False.

The shapes of x and y are broadcast together, and their dtypes are promoted to a common dtype.
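As a rough illustration of broadcasting and dtype promotion for an elementwise maximum, the sketch below uses NumPy on the host as a stand-in. It is not cuda.tile code: np.maximum and np.result_type are NumPy functions, and the assumption here is only that cuda.tile follows the same general idea. Its exact promotion table may differ from NumPy's.

```python
import numpy as np

# Broadcasting: a (4, 1) operand against a (2,) operand gives a (4, 2) result.
x = np.arange(4, dtype=np.int32).reshape(4, 1)  # [[0], [1], [2], [3]]
y = np.array([1, 2], dtype=np.int32)            # [1, 2]
broadcast_max = np.maximum(x, y)

# Promotion: mixed dtypes are converted to one common dtype before comparing.
# The common dtype below is NumPy's choice; cuda.tile may choose differently.
xi = np.arange(4, dtype=np.int32)
yf = np.full((4,), 1.5, dtype=np.float32)
common_dtype = np.result_type(xi.dtype, yf.dtype)
promoted_max = np.maximum(xi, yf)

print(broadcast_max.shape)   # shape after broadcasting
print(promoted_max.dtype)    # dtype after promotion
```

The point to take away is that both operands are brought to one shape and one dtype first, and only then compared element by element.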

Return type:

Tile

Examples

import cuda.tile as ct
import torch

@ct.kernel
def kernel():
    tx = ct.arange(4, dtype=ct.int32)
    ty = ct.full((4,), 2, dtype=ct.int32)
    print(max(tx, ty))


torch.cuda.init()
ct.launch(torch.cuda.current_stream(), (1,), kernel, ())
torch.cuda.synchronize()

Output

[2, 2, 2, 3]
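The effect described for flush_to_zero=True can be emulated on the host to see what "sign-preserving zero" means. The following is a NumPy sketch under stated assumptions, not the cuda.tile implementation: the helper flush_subnormals is hypothetical and introduced only for illustration, and it models flushing as replacing any float32 value smaller in magnitude than the smallest normal number with a zero of the same sign.

```python
import numpy as np

def flush_subnormals(a):
    """Hypothetical helper: replace float32 subnormals with same-signed zero."""
    a = np.asarray(a, dtype=np.float32)
    smallest_normal = np.finfo(np.float32).tiny  # about 1.18e-38
    is_tiny = np.abs(a) < smallest_normal        # subnormals (and zeros)
    signed_zero = np.copysign(np.float32(0.0), a)
    return np.where(is_tiny, signed_zero, a)

# 1e-40 is below the smallest normal float32, so it is subnormal.
x = np.array([1e-40, -1e-40, 1.0], dtype=np.float32)
y = np.array([-1.0, -1.0, 0.5], dtype=np.float32)

plain = np.maximum(x, y)
# Emulate flushing inputs, taking the maximum, then flushing the result.
flushed = flush_subnormals(np.maximum(flush_subnormals(x), flush_subnormals(y)))

print(plain)
print(flushed)
```

In this emulation the two subnormal entries become +0.0 and -0.0 respectively, while the normal value 1.0 passes through unchanged.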