Runtime

Description

API documentation

class cutlass.cute.runtime._Pointer(*args: Any, **kwargs: Any)

Bases: Pointer

Runtime representation of a pointer that can interoperate with various data structures, including NumPy arrays and device memory.

Parameters:
  • pointer (int or pointer-like object) – The pointer to the data

  • dtype (Type) – Data type of the elements pointed to

  • mem_space (_cute_ir.AddressSpace, optional) – Memory space where the pointer resides, defaults to generic

  • assumed_align (int, optional) – Assumed alignment of input pointer in bytes, defaults to None

Variables:
  • _pointer – The underlying pointer

  • _dtype – Data type of the elements

  • _addr_space – Memory space of the pointer

  • _assumed_align – Alignment of the pointer in bytes

  • _desc – C-type descriptor for the pointer

  • _c_pointer – C-compatible pointer representation
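To make the assumed_align parameter concrete, the following minimal sketch checks whether a raw address satisfies a given byte alignment. It uses only the standard ctypes module. The helper is_aligned is a hypothetical name introduced for illustration and is not part of cutlass.cute.runtime.

```python
import ctypes


def is_aligned(address: int, align_bytes: int) -> bool:
    """Return True if `address` is a multiple of `align_bytes` (hypothetical helper)."""
    if align_bytes <= 0:
        raise ValueError("alignment must be a positive number of bytes")
    return address % align_bytes == 0


# Allocate a small host buffer of 16 floats and take its address as a plain integer.
buf = (ctypes.c_float * 16)()
address = ctypes.addressof(buf)

# Every address is 1-byte aligned; stricter alignments depend on the allocator.
print(is_aligned(address, 1))
print(is_aligned(address, ctypes.alignment(ctypes.c_float)))
```

An alignment passed as assumed_align is a promise made by the caller, so a check of this kind is one way to validate an address before making that promise.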

__init__(
    pointer,
    dtype,
    mem_space: cutlass._mlir.dialects.cute.AddressSpace = cutlass._mlir.dialects.cute.AddressSpace.generic,
    assumed_align=None,
)

size_in_bytes() -> int

property mlir_type: cutlass._mlir.ir.Type

property dtype: Type[cutlass.cute.typing.Numeric]

property memspace

align(
    min_align: int,
    *,
    loc=None,
    ip=None,
) -> cutlass.cute.typing.Pointer

verify(expected_py_type)

class cutlass.cute.runtime._Tensor(*args: Any, **kwargs: Any)

Bases: Tensor

__init__(tensor, assumed_align=None, use_32bit_stride=False)

lazily_load_dltensor()

Decorator to lazily load the DLTensorWrapper.

This decorator loads the DLTensorWrapper when needed, avoiding overhead in the critical path of calling JIT functions.
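The lazy-loading idea described here can be sketched in plain Python. This is a generic illustration of the pattern and not the library's implementation: lazily_load, LazyWrapper, _load and load_count are hypothetical names, and the "expensive" step is a placeholder.

```python
import functools


def lazily_load(func):
    """Decorator: make sure the wrapped resource is loaded before `func` runs."""

    @functools.wraps(func)
    def wrapper(self, *args, **kwargs):
        if self._loaded is None:
            self._load()  # pay the cost only on first use
        return func(self, *args, **kwargs)

    return wrapper


class LazyWrapper:
    """Hypothetical stand-in for an object that wraps an expensive resource."""

    def __init__(self, source):
        self._source = source
        self._loaded = None   # populated on first use
        self.load_count = 0   # counts how often the expensive step ran

    def _load(self):
        # Placeholder for the expensive step (e.g. building a wrapper object).
        self.load_count += 1
        self._loaded = list(self._source)

    @lazily_load
    def size(self):
        return len(self._loaded)


w = LazyWrapper(range(4))
print(w.load_count)            # 0: construction does not load anything
print(w.size())                # 4: first call triggers the load
print(w.size(), w.load_count)  # 4 1: second call reuses the loaded data
```

Construction stays cheap, and only the methods that actually need the loaded data pay for loading it.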

mark_layout_dynamic(*args, **kwargs)

mark_compact_shape_dynamic(*args, **kwargs)

property element_type

property memspace

property size_in_bytes

property mlir_type

property iterator

property layout

property shape

property stride

property leading_dim

Get the leading dimension of this Tensor.

Returns:

The leading dimension index or indices

Return type:

int or tuple or None

The return value depends on the tensor’s stride pattern:

  • If a single leading dimension is found, returns an integer index

  • If nested leading dimensions are found, returns a tuple of indices

  • If no leading dimension is found, returns None
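The three return cases can be illustrated with a small pure-Python sketch. It assumes, for illustration only, that the leading dimension is the mode whose stride equals 1. The function find_leading_dim is a hypothetical helper and not the library's implementation, which may apply additional rules.

```python
def find_leading_dim(stride):
    """Return the index (or nested index path) of the first stride equal to 1.

    Assumption for illustration: the leading dimension is the mode whose
    stride is 1. Returns None when no such mode exists.
    """
    for i, s in enumerate(stride):
        if isinstance(s, tuple):
            inner = find_leading_dim(s)
            if inner is not None:
                # Nested mode: return a tuple describing the path to the stride.
                return (i,) + (inner if isinstance(inner, tuple) else (inner,))
        elif s == 1:
            return i
    return None


print(find_leading_dim((32, 1)))       # 1
print(find_leading_dim(((1, 4), 16)))  # (0, 0)
print(find_leading_dim((8, 2)))        # None
```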

fill(value: cutlass.cute.typing.Numeric)

property data_ptr

cutlass.cute.runtime.from_dlpack(
    tensor_dlpack,
    assumed_align=None,
    use_32bit_stride=False,
) -> cutlass.cute.typing.Tensor

Convert a tensor object that supports __dlpack__() into a CuTe Tensor.

Parameters:
  • tensor_dlpack (object) – Tensor object that supports the DLPack protocol

  • assumed_align (int, optional) – Assumed alignment of the tensor in bytes. Defaults to None, in which case the element size in bytes is used as the assumed alignment.

  • use_32bit_stride (bool, optional) – Whether to use 32-bit strides, defaults to False. When True, the bit width of dynamic strides is set to 32 for small problem sizes (cosize(layout) <= Int32_max) for better performance. This applies only to dimensions that are dynamic.

Returns:

A CuTe Tensor object

Return type:

Tensor
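The condition cosize(layout) <= Int32_max mentioned for use_32bit_stride can be checked by hand. The sketch below assumes a flat (non-nested) layout with non-negative strides, for which the cosize is the largest reachable offset plus one. The names cosize, fits_32bit_stride and INT32_MAX are defined locally for illustration and are not imported from the library.

```python
INT32_MAX = 2**31 - 1


def cosize(shape, stride):
    """Cosize of a flat layout: the largest offset it can produce, plus one.

    Assumes flat (non-nested) shape/stride tuples, extents >= 1 and
    non-negative strides.
    """
    if len(shape) != len(stride):
        raise ValueError("shape and stride must have the same rank")
    return 1 + sum((s - 1) * d for s, d in zip(shape, stride))


def fits_32bit_stride(shape, stride):
    """True if every offset of the layout is representable in a signed 32-bit int."""
    return cosize(shape, stride) <= INT32_MAX


print(cosize((100, 100), (100, 1)))                   # 10000
print(fits_32bit_stride((100, 100), (100, 1)))        # True
print(fits_32bit_stride((65536, 65536), (65536, 1)))  # False
```

For the row-major 100 x 100 tensor used in the example on this page, the cosize is 10000, well within the 32-bit range.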

Examples

import torch
from cutlass.cute.runtime import from_dlpack
x = torch.randn(100, 100)
y = from_dlpack(x)
y.shape
# (100, 100)
type(y)
# <class 'cutlass.cute.Tensor'>

cutlass.cute.runtime.make_ptr(
    dtype: Type[cutlass.cute.typing.Numeric],
    value: int | _Pointer,
    mem_space: cutlass.cute.typing.AddressSpace = cutlass.cute.typing.AddressSpace.generic,
    assumed_align=None,
) -> cutlass.cute.typing.Pointer

Create a pointer from a memory address.

Parameters:
  • dtype (Type[Numeric]) – Data type of the pointer elements

  • value (Union[int, ctypes._Pointer]) – Memory address as integer or ctypes pointer

  • mem_space (AddressSpace, optional) – Memory address space, defaults to AddressSpace.generic

  • assumed_align (int, optional) – Assumed alignment of the pointer in bytes, defaults to None

Returns:

A pointer object

Return type:

Pointer

Examples

import numpy as np
import ctypes

from cutlass import Float32
from cutlass.cute.runtime import make_ptr

# Create a numpy array
a = np.random.randn(16, 32).astype(np.float32)

# Get a ctypes pointer to the array data
ptr_address = a.ctypes.data_as(ctypes.POINTER(ctypes.c_float))

# Create pointer from the ctypes pointer
y = make_ptr(Float32, ptr_address)

# Check properties
print(y.dtype)
print(type(y))  # <class 'cutlass.cute.Pointer'>
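The value parameter is documented as accepting either an integer address or a ctypes pointer. The following sketch uses only the standard ctypes module to show how both forms can be obtained for the same buffer and that they refer to the same memory. It does not call make_ptr itself, and the variable names are illustrative.

```python
import ctypes

# A small host buffer holding the values 0.0 .. 7.0.
buf = (ctypes.c_float * 8)(*range(8))

# Form 1: the address as a plain Python integer.
address = ctypes.addressof(buf)

# Form 2: a typed ctypes pointer to the same memory.
typed_ptr = ctypes.cast(buf, ctypes.POINTER(ctypes.c_float))

# Both forms refer to the same location.
same = ctypes.cast(typed_ptr, ctypes.c_void_p).value == address
print(same)          # True
print(typed_ptr[3])  # 3.0
```

The buffer must stay alive for as long as either form of the address is in use, because neither an integer nor a cast pointer is guaranteed to keep the underlying memory from being freed.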
class cutlass.cute.runtime.TensorAdapter(arg)

Bases: object

Convert a tensor or array that supports the DLPack protocol to a CuTe tensor.

__init__(arg)
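As a rough illustration of what "supports the DLPack protocol" means, the Python array API standard defines two methods for exporters, __dlpack__ and __dlpack_device__. The sketch below checks for them on an arbitrary object. The helper supports_dlpack and the class FakeArray are hypothetical, and whether TensorAdapter performs a check like this is an assumption, not something stated in this reference.

```python
def supports_dlpack(obj) -> bool:
    """Return True if `obj` exposes both DLPack exporter methods (hypothetical helper)."""
    return callable(getattr(obj, "__dlpack__", None)) and callable(
        getattr(obj, "__dlpack_device__", None)
    )


class FakeArray:
    """Hypothetical stand-in that only mimics the method names of an exporter."""

    def __dlpack__(self, stream=None):
        raise NotImplementedError("illustration only, no capsule is produced")

    def __dlpack_device__(self):
        return (1, 0)  # (device type kDLCPU, device id 0)


print(supports_dlpack(FakeArray()))  # True
print(supports_dlpack([1, 2, 3]))    # False
```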