cuda.tile.extract
- cuda.tile.extract(x, /, index, shape)
Extracts a smaller tile from the input tile.
Partitions the input tile into a grid of subtiles of the given shape and returns the subtile at the given index into that grid. Similar to load(), but performed on a tile.
- Parameters:
x (Tile) – Input tile.
index (Shape) – Index into the grid of subtiles, not an element index. Dimension i has x.shape[i] // shape[i] subtiles, so valid values are in [0, x.shape[i] // shape[i]). For example, extracting shape (4,) from a (128,) tile gives 32 subtiles, so valid indices are 0–31.
shape (Shape) – The shape of the extracted tile. Must evenly divide x.shape in every dimension.
- Return type:
Tile
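The grid arithmetic described for index and shape can be sketched in plain Python. This is only an illustration of the stated rules (evenly dividing shape, half-open valid range per dimension); the helper names subtile_grid and element_ranges are hypothetical and are not part of cuda.tile.

```python
from itertools import product


def subtile_grid(tile_shape, sub_shape):
    """Return the number of subtiles along each dimension (hypothetical helper)."""
    if len(tile_shape) != len(sub_shape):
        raise ValueError("tile shape and subtile shape must have the same rank")
    for n, s in zip(tile_shape, sub_shape):
        # The subtile shape must evenly divide the tile shape in every dimension.
        if s <= 0 or n % s != 0:
            raise ValueError(f"subtile extent {s} does not evenly divide {n}")
    return tuple(n // s for n, s in zip(tile_shape, sub_shape))


def element_ranges(tile_shape, index, sub_shape):
    """Map a grid index to half-open element ranges (hypothetical helper)."""
    grid = subtile_grid(tile_shape, sub_shape)
    if len(index) != len(grid):
        raise ValueError("index must have one entry per dimension")
    for i, g in zip(index, grid):
        # Valid grid indices are in [0, g) for each dimension.
        if not 0 <= i < g:
            raise IndexError(f"grid index {i} is outside [0, {g})")
    return tuple((i * s, (i + 1) * s) for i, s in zip(index, sub_shape))


# A (128,) tile split into (4,) subtiles has 32 grid positions.
print(subtile_grid((128,), (4,)))           # (32,)
# The last valid grid index, 31, covers elements 124..127.
print(element_ranges((128,), (31,), (4,)))  # ((124, 128),)
# Enumerate every grid index of a (4, 4) tile split into (2, 2) subtiles.
print(list(product(*(range(g) for g in subtile_grid((4, 4), (2, 2))))))
```

The key point is that index counts subtiles rather than elements, so the element offset in each dimension is index[i] * shape[i].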
Examples
1D tile.
tile = ct.arange(8, dtype=ct.int32)
sub = ct.extract(tile, (0,), shape=(4,))
print(f'(0,): {sub}')
sub = ct.extract(tile, (1,), shape=(4,))
print(f'(1,): {sub}')
import cuda.tile as ct
import torch

@ct.kernel
def kernel():
    tile = ct.arange(8, dtype=ct.int32)
    sub = ct.extract(tile, (0,), shape=(4,))
    print(f'(0,): {sub}')
    sub = ct.extract(tile, (1,), shape=(4,))
    print(f'(1,): {sub}')

torch.cuda.init()
ct.launch(torch.cuda.current_stream(), (1,), kernel, ())
torch.cuda.synchronize()
Output
(0,): [0, 1, 2, 3]
(1,): [4, 5, 6, 7]
2D tile.
tile = ct.arange(16, dtype=ct.int32).reshape((4, 4))
sub = ct.extract(tile, (0, 0), shape=(2, 2))
print(f'(0, 0): {sub}')
sub = ct.extract(tile, (0, 1), shape=(2, 2))
print(f'(0, 1): {sub}')
import cuda.tile as ct
import torch

@ct.kernel
def kernel():
    tile = ct.arange(16, dtype=ct.int32).reshape((4, 4))
    sub = ct.extract(tile, (0, 0), shape=(2, 2))
    print(f'(0, 0): {sub}')
    sub = ct.extract(tile, (0, 1), shape=(2, 2))
    print(f'(0, 1): {sub}')

torch.cuda.init()
ct.launch(torch.cuda.current_stream(), (1,), kernel, ())
torch.cuda.synchronize()
Output
(0, 0): [[0, 1], [4, 5]]
(0, 1): [[2, 3], [6, 7]]
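For readers without a GPU at hand, the same grid-index semantics can be modeled on the host with nested Python lists. The function extract_reference below is a hypothetical stand-in written for illustration, assuming row-major nested lists in place of tiles; it is not the cuda.tile implementation and says nothing about how the library performs the extraction.

```python
def extract_reference(x, index, shape):
    """Pure-Python model of subtile extraction on nested lists (not cuda.tile)."""
    if not index:
        # All dimensions consumed: x is a single element.
        return x
    i, s = index[0], shape[0]
    if len(x) % s != 0:
        raise ValueError(f"subtile extent {s} does not evenly divide {len(x)}")
    if not 0 <= i < len(x) // s:
        raise IndexError(f"grid index {i} is outside [0, {len(x) // s})")
    # Take the i-th block of s entries, then recurse into the remaining dimensions.
    return [
        extract_reference(row, index[1:], shape[1:])
        for row in x[i * s:(i + 1) * s]
    ]


# Host-side stand-ins for the tiles used in the examples above.
tile_1d = list(range(8))
tile_2d = [list(range(r * 4, r * 4 + 4)) for r in range(4)]

print(extract_reference(tile_1d, (0,), (4,)))       # [0, 1, 2, 3]
print(extract_reference(tile_1d, (1,), (4,)))       # [4, 5, 6, 7]
print(extract_reference(tile_2d, (0, 0), (2, 2)))   # [[0, 1], [4, 5]]
print(extract_reference(tile_2d, (0, 1), (2, 2)))   # [[2, 3], [6, 7]]
```

Under these assumptions the model reproduces the outputs listed for the 1D and 2D examples, which makes it convenient for checking an index and shape combination before writing a kernel.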