warp submodule#

class cutlass.cute.nvgpu.warp.MmaF16BF16Op(
ab_dtype: Type[cutlass.cute.typing.Numeric],
acc_dtype: Type[cutlass.cute.typing.Numeric],
shape_mnk: cutlass.cute.typing.Shape,
)#

Bases: MmaOp

F16/BF16 warp-level MMA Operation.

See the PTX documentation. This operation covers the mma instructions that use the .f16 or .bf16 qualifiers for the input operands.

ab_dtype: Type[cutlass.cute.typing.Numeric]#
acc_dtype: Type[cutlass.cute.typing.Numeric]#
shape_mnk: cutlass.cute.typing.Shape#
__init__(
ab_dtype: Type[cutlass.cute.typing.Numeric],
acc_dtype: Type[cutlass.cute.typing.Numeric],
shape_mnk: cutlass.cute.typing.Shape,
) → None#
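As a usage sketch (assuming the `cutlass` Python package is installed and exposes the `Float16`/`Float32` numeric types, which this page does not show), an instance for the common 16x8x16 half-precision `mma.sync` shape could be constructed like this:

```python
import cutlass
from cutlass.cute.nvgpu import warp

# Warp-level MMA with F16 A/B operands, F32 accumulation, and the
# (M, N, K) = (16, 8, 16) instruction shape used by mma.sync on SM80+.
mma_op = warp.MmaF16BF16Op(
    ab_dtype=cutlass.Float16,
    acc_dtype=cutlass.Float32,
    shape_mnk=(16, 8, 16),
)
```

The op object only describes the instruction; it is typically wrapped into an MMA atom (for example via `cute.make_mma_atom`, assumed here) before being used inside a kernel.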
class cutlass.cute.nvgpu.warp.LdMatrix8x8x16bOp(transpose: bool = False, num_matrices: int = 1)#

Bases: BaseOp

8x8 ldmatrix Operation.

See the PTX documentation. This operation corresponds to the .m8n8 qualifier.

__init__(
transpose: bool = False,
num_matrices: int = 1,
) → None#
class cutlass.cute.nvgpu.warp.LdMatrix16x16x8bOp(num_matrices: int)#

Bases: BaseOp

16x16 8-bit ldmatrix Operation.

See the PTX documentation. This operation corresponds to the .m16n16 and the .b8 qualifiers.

__init__(num_matrices: int) → None#
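A hedged sketch of constructing the ldmatrix ops above. In PTX, num_matrices selects the .x1/.x2/.x4 .num qualifier and transpose selects .trans; which combinations are legal depends on the target architecture:

```python
from cutlass.cute.nvgpu import warp

# ldmatrix.m8n8.x4: four 8x8 16-bit tiles per instruction, non-transposed.
ld_16b = warp.LdMatrix8x8x16bOp(transpose=False, num_matrices=4)

# Transposed variant, corresponding to the .trans qualifier.
ld_16b_t = warp.LdMatrix8x8x16bOp(transpose=True, num_matrices=4)

# ldmatrix.m16n16: 16x16 8-bit tiles; num_matrices again maps to .num.
ld_8b = warp.LdMatrix16x16x8bOp(num_matrices=2)
```

These ops are descriptors; in practice they are wrapped into copy atoms for shared-memory-to-register loads.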
class cutlass.cute.nvgpu.warp.StMatrix8x8x16bOp(transpose: bool = False, num_matrices: int = 1)#

Bases: BaseOp

8x8 stmatrix Operation.

See the PTX documentation. This operation corresponds to the .m8n8 qualifier.

__init__(
transpose: bool = False,
num_matrices: int = 1,
) → None#
class cutlass.cute.nvgpu.warp.StMatrix16x8x8bOp(num_matrices: int)#

Bases: BaseOp

16x8 8-bit stmatrix Operation.

See the PTX documentation. This operation corresponds to the .m16n8 qualifier.

__init__(num_matrices: int) → None#
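Construction of the stmatrix ops mirrors the ldmatrix ops; a hedged sketch (num_matrices maps to the PTX .num qualifier, and architecture support for each variant should be checked against the PTX documentation):

```python
from cutlass.cute.nvgpu import warp

# stmatrix.m8n8.x2: two 8x8 16-bit tiles stored per instruction.
st_16b = warp.StMatrix8x8x16bOp(transpose=False, num_matrices=2)

# stmatrix.m16n8: 16x8 8-bit tiles.
st_8b = warp.StMatrix16x8x8bOp(num_matrices=2)
```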