DevicePipeline

class nvmath.device.DevicePipeline(mm: Matmul, pipeline_depth: int, a: ndarray, b: ndarray)

DevicePipeline helps users configure kernel launches for pipelined matrix multiplication. It also serves as the access point for obtaining a TilePipeline object within a kernel.

Refer to the cuBLASDx documentation for more details on how to use this class: using_pipelines.html
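As a rough illustration of what a pipeline depth means for a tiled matrix multiplication, the following is a plain-Python conceptual model. It does not call nvmath or cuBLASDx and makes no claim about how DevicePipeline is implemented. The names pipelined_matmul, tile_k, and load are hypothetical and exist only for this sketch. It assumes that tiles along the inner (K) dimension are staged into a bounded set of buffers, and that a new tile is loaded only once an older one has been consumed.

```python
from collections import deque


def pipelined_matmul(a, b, tile_k, pipeline_depth):
    """Conceptual model only (hypothetical helper, not part of nvmath).

    Multiplies a (M x K) by b (K x N) by streaming K-tiles through a
    bounded queue holding at most `pipeline_depth` staged tiles.
    """
    m, k, n = len(a), len(b), len(b[0])
    assert all(len(row) == k for row in a), "inner dimensions must match"
    assert pipeline_depth >= 1 and tile_k >= 1

    c = [[0] * n for _ in range(m)]
    k_starts = list(range(0, k, tile_k))
    stages = deque()  # stands in for a fixed number of staging buffers
    next_load = 0

    def load(k0):
        # "Producer" step: copy one K-tile of a and b into a staging slot.
        k1 = min(k0 + tile_k, k)
        a_tile = [row[k0:k1] for row in a]
        b_tile = [row[:] for row in b[k0:k1]]
        return a_tile, b_tile

    # Prologue: fill up to `pipeline_depth` slots before any compute.
    while next_load < len(k_starts) and len(stages) < pipeline_depth:
        stages.append(load(k_starts[next_load]))
        next_load += 1

    # Steady state: consume the oldest stage, then refill the freed slot.
    while stages:
        a_tile, b_tile = stages.popleft()
        for i in range(m):
            for j in range(n):
                c[i][j] += sum(
                    a_tile[i][t] * b_tile[t][j] for t in range(len(b_tile))
                )
        if next_load < len(k_starts):
            stages.append(load(k_starts[next_load]))
            next_load += 1
    return c
```

In this model, a larger pipeline_depth only changes how many tiles are staged ahead of the compute step; the numerical result is the same for any depth. On real hardware the loads and the compute would overlap in time, which sequential Python cannot show.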

Attributes

a_strides
b_strides
block_dim
buffer_alignment
buffer_size
storage_alignment
storage_bytes

Methods

get_tile(smem: ndarray, blockIdx_x: int, blockIdx_y: int) → TilePipeline

reset_tile(tile_pipeline: TilePipeline, idx: int | tuple[int, int], idy: int | tuple[int, int])