Common

class cutlass.cute.nvgpu.OpError(*args: Any, **kwargs: Any)

Bases: DSLBaseError

An exception class for Op construction errors.

class cutlass.cute.nvgpu.MmaUniversalOp(abacc_dtype: Type[cutlass.cute.typing.Numeric])

Bases: MmaOp

The universal MMA Operation.

This Operation currently expects the A/B operands and the accumulator to share a single data type.

Parameters:

abacc_dtype (Type[Numeric]) – The data type for the A/B operands and the accumulator

abacc_dtype: Type[cutlass.cute.typing.Numeric]
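As a minimal sketch of the constructor described above, the snippet below builds the operation with one data type for A, B, and the accumulator. It assumes the CuTe DSL Python package is installed and that `cutlass.Float32` is a suitable Numeric type; the helper name `build_universal_mma_op` and the `HAVE_CUTE_DSL` flag are hypothetical and only keep the snippet importable on machines without the DSL.

```python
# Guarded import: the CuTe DSL may not be installed on every machine.
try:
    import cutlass
    import cutlass.cute as cute
    HAVE_CUTE_DSL = True
except ImportError:
    HAVE_CUTE_DSL = False


def build_universal_mma_op():
    """Return an FP32 MmaUniversalOp, or None when the DSL is unavailable.

    Hypothetical helper, not part of the library.
    """
    if not HAVE_CUTE_DSL:
        return None
    # A single dtype is passed: it applies to A, B, and the accumulator.
    return cute.nvgpu.MmaUniversalOp(cutlass.Float32)


op = build_universal_mma_op()
if op is not None:
    # The dtype is exposed through the documented abacc_dtype attribute.
    print(op.abacc_dtype)
```

Mixed-precision combinations (for example, half-precision operands with a single-precision accumulator) do not fit this operation, since only one data type can be supplied.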

class cutlass.cute.nvgpu.CopyUniversalOp

Bases: CopyOp

The universal Copy Operation.

When creating a Copy Atom out of this operation, the expected usage pattern is:

import cutlass.cute as cute

op = cute.nvgpu.CopyUniversalOp()
atom = cute.make_copy_atom(op, tensor_dtype, num_bits_per_copy=64)
  • tensor_dtype is the data type used to build the reference TV Layout (either the source or the destination TV Layout), expressed in units of tensor elements. It is used, for example, by TiledCopy for partitioning.

  • num_bits_per_copy is a keyword argument specifying the number of bits to copy per Atom execution. It can be larger than the width of the above data type. When it is not provided, the compiler makes a best-effort attempt at auto-vectorization.
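To make the relationship between num_bits_per_copy and the element width concrete, the following library-independent sketch computes how many tensor elements one Atom execution would move. The helper `elements_per_copy` is hypothetical and not part of the CuTe DSL, and the requirement that the copy width be a whole multiple of the element width is an assumption made for this illustration.

```python
def elements_per_copy(num_bits_per_copy: int, dtype_width_bits: int) -> int:
    """Number of tensor elements covered by one copy of the given bit width.

    Hypothetical helper for illustration only. It assumes the copy width is
    a positive whole multiple of the element width.
    """
    if num_bits_per_copy <= 0 or dtype_width_bits <= 0:
        raise ValueError("bit widths must be positive")
    if num_bits_per_copy % dtype_width_bits != 0:
        raise ValueError("copy width must be a multiple of the element width")
    return num_bits_per_copy // dtype_width_bits


# With num_bits_per_copy=64, as in the usage pattern above:
print(elements_per_copy(64, 16))  # 16-bit elements -> 4 per Atom execution
print(elements_per_copy(64, 32))  # 32-bit elements -> 2 per Atom execution
print(elements_per_copy(64, 8))   # 8-bit elements  -> 8 per Atom execution
```

Under this reading, a 64-bit copy of 16-bit elements moves four elements at once, which is the sense in which the copy width can exceed the width of tensor_dtype.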