cuda.tile.maximum
- cuda.tile.maximum(x, y, /, *, flush_to_zero=False)
Elementwise maximum on two tiles.
The builtin max(x, y) can be used as an equivalent.
- Parameters:
The shapes of x and y are broadcast against each other, and their dtypes are promoted to a common dtype.
- Return type:
Examples
tx = ct.arange(4, dtype=ct.int32)
ty = ct.full((4,), 2, dtype=ct.int32)
print(max(tx, ty))
import cuda.tile as ct
import torch

@ct.kernel
def kernel():
    tx = ct.arange(4, dtype=ct.int32)
    ty = ct.full((4,), 2, dtype=ct.int32)
    print(max(tx, ty))

torch.cuda.init()
ct.launch(torch.cuda.current_stream(), (1,), kernel, ())
torch.cuda.synchronize()
Output
[2, 2, 2, 3]
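The broadcasting and dtype promotion mentioned under Parameters can be illustrated on the host with NumPy. This is only an analogy: it uses np.maximum rather than cuda.tile, and it assumes that cuda.tile follows NumPy-style broadcasting and integer promotion for this case, which the page above does not confirm. The shapes and dtypes are chosen purely for illustration.

```python
import numpy as np

# NumPy stand-ins for the two tiles; this is not cuda.tile code.
x = np.arange(4, dtype=np.int16).reshape(4, 1)  # shape (4, 1), int16
y = np.array([1, 2], dtype=np.int32)            # shape (2,),   int32

# Shapes (4, 1) and (2,) broadcast to (4, 2);
# int16 and int32 promote to the common dtype int32.
result = np.maximum(x, y)

print(result.shape)     # (4, 2)
print(result.dtype)     # int32
print(result.tolist())  # [[1, 2], [1, 2], [2, 2], [3, 3]]
```

Each output element is the larger of the corresponding broadcast elements, so row i holds max(i, 1) and max(i, 2).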