Partitioner
- class nvmath.device.Partitioner(*args)
Partitioner is an abstraction for partitioning a global memory tensor into a partitioned tensor.
Note
Do not create directly; use nvmath.device.Matmul.suggest_partitioner() instead.
Refer to the cuBLASDx documentation for more details on how to use this class: https://docs.nvidia.com/cuda/cublasdx/api/other_tensors.html#partitioner-register-tensor-other-label
Methods
- abstract is_index_in_bounds(index: int) -> bool
Checks if the given index is within the bounds of the partitioned tensor. This is used to prevent out-of-bounds access in the kernel.
- abstract is_predicated() -> bool
Checks if the current thread is predicated. This is used to determine if the thread should execute the kernel.
- abstract map_fragment_index(fragment_index: int) -> tuple[int, int]
Maps the given fragment index to a global memory index. This is used to access the correct element in the partitioned tensor.
- abstract partition_like_C(gmem_c: OpaqueTensor)
Partitions the given global memory tensor gmem_c into a partitioned tensor. The partitioned tensor is used for accessing the C matrix when working with a register fragment.
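To make the roles of map_fragment_index and is_index_in_bounds more concrete, here is a minimal CPU-only sketch. It is not the nvmath or cuBLASDx implementation: ToyPartitioner, its fields, the strided thread-to-element layout, and store_fragment are all hypothetical names and choices made for illustration. It only shows the general pattern of mapping each slot of a per-thread register fragment to a (row, column) position in the global tensor and skipping slots that fall outside the valid region.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ToyPartitioner:
    """Hypothetical toy model; NOT the nvmath.device.Partitioner implementation."""

    rows: int         # valid rows of the global C tensor
    cols: int         # valid columns of the global C tensor
    tile_rows: int    # rows of the (possibly larger) tile owned by the block
    tile_cols: int    # columns of that tile
    num_threads: int  # threads cooperating on the tile
    thread_id: int    # the thread this partitioner instance belongs to

    @property
    def fragment_size(self) -> int:
        # Elements held per thread, rounded up (assumed layout).
        total = self.tile_rows * self.tile_cols
        return -(-total // self.num_threads)

    def map_fragment_index(self, fragment_index: int) -> tuple[int, int]:
        # Assumed strided layout: consecutive threads own consecutive elements.
        linear = fragment_index * self.num_threads + self.thread_id
        return divmod(linear, self.tile_cols)

    def is_index_in_bounds(self, fragment_index: int) -> bool:
        row, col = self.map_fragment_index(fragment_index)
        return row < self.rows and col < self.cols


def store_fragment(partitioner, fragment, gmem):
    """Write a thread's fragment to gmem, skipping out-of-bounds slots."""
    for i, value in enumerate(fragment):
        if partitioner.is_index_in_bounds(i):
            row, col = partitioner.map_fragment_index(i)
            gmem[row][col] = value


# A 3x3 valid region covered by a 4x4 tile and 4 simulated threads.
gmem = [[None] * 3 for _ in range(3)]
for tid in range(4):
    p = ToyPartitioner(rows=3, cols=3, tile_rows=4, tile_cols=4,
                       num_threads=4, thread_id=tid)
    fragment = [(tid, i) for i in range(p.fragment_size)]
    store_fragment(p, fragment, gmem)
```

In this toy layout, thread tid and fragment slot i map to row i and column tid, so the guard drops the fourth row and the fourth thread's elements. A real partitioner obtained from suggest_partitioner() chooses its own layout; the guarded-access pattern is the part this sketch is meant to convey.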