cuda.tile.pack_to_bytes
- cuda.tile.pack_to_bytes(x, /)
Flattens a tile and reinterprets its raw bytes as uint8 elements.
The total number of bits of the input tile must be divisible by 8.
- Parameters:
x (Tile) – input tile.
- Returns:
a 1D uint8 tile with total_elements * bit width // 8 elements.
- Return type:
Tile
Examples
tx = ct.full(1, 0x04030201, dtype=ct.int32)
ty = ct.pack_to_bytes(tx)
print(ty)
import cuda.tile as ct
import torch

@ct.kernel
def kernel():
    tx = ct.full(1, 0x04030201, dtype=ct.int32)
    ty = ct.pack_to_bytes(tx)
    print(ty)

torch.cuda.init()
ct.launch(torch.cuda.current_stream(), (1,), kernel, ())
torch.cuda.synchronize()
Output
[1, 2, 3, 4]
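The output above can be reproduced on the host without a GPU. The sketch below is only an analogy built on Python's standard struct module, not part of cuda.tile. The helper names packed_num_bytes and pack_int32_to_bytes_host are hypothetical, and the little-endian byte order is an assumption inferred from the output shown above.

```python
import struct


def packed_num_bytes(num_elements, bit_width):
    """Hypothetical helper: size in bytes of the packed result.

    Mirrors the documented rule total_elements * bit width // 8 and
    rejects inputs whose total bit count is not a multiple of 8.
    """
    total_bits = num_elements * bit_width
    if total_bits % 8 != 0:
        raise ValueError(f"{total_bits} bits is not a whole number of bytes")
    return total_bits // 8


def pack_int32_to_bytes_host(values):
    """Hypothetical host-side analogy for int32 input.

    Serializes the flattened values as little-endian int32 (an assumption
    that matches the example output) and returns each raw byte as an int.
    """
    raw = struct.pack(f"<{len(values)}i", *values)
    return list(raw)


packed = pack_int32_to_bytes_host([0x04030201])
print(packed)  # [1, 2, 3, 4]
```

The least significant byte 0x01 comes first, which is why the result reads 1, 2, 3, 4 rather than 4, 3, 2, 1. A single int32 element packs to packed_num_bytes(1, 32) == 4 bytes, while three 4-bit elements (12 bits) would be rejected by the divisibility check.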