cuda.tile.rsqrt#
- cuda.tile.rsqrt(x, /, *, flush_to_zero=False)#
Perform rsqrt on a tile.
- Parameters:
x (Tile)
flush_to_zero (const bool) – If True, flushes subnormal inputs and results to sign-preserving zero, default is False.
- Return type:
Examples
tx = ct.full((4,), 4.0, dtype=ct.float32) print(f"{ct.rsqrt(tx):.1f}")
import cuda.tile as ct import torch @ct.kernel def kernel(): tx = ct.full((4,), 4.0, dtype=ct.float32) print(f"{ct.rsqrt(tx):.1f}") torch.cuda.init() ct.launch(torch.cuda.current_stream(), (1,), kernel, ()) torch.cuda.synchronize()
Output
[0.5, 0.5, 0.5, 0.5]