Quick Start Guide

The CUTLASS DSL 4.4 release currently supports only Linux and Python 3.10 through 3.13. To install the CUTLASS DSLs (limited to the CuTe DSL for now), follow the steps below.
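Before installing, it can help to confirm that the current interpreter meets these requirements. The following is a minimal sketch and not part of CUTLASS: the helper name `is_supported` is hypothetical, and the bounds simply restate the supported range given above.

```python
import platform
import sys

# Supported range as stated for the 4.4 release (assumption: both ends inclusive).
MIN_PYTHON = (3, 10)
MAX_PYTHON = (3, 13)


def is_supported(system: str, version: tuple[int, int]) -> bool:
    """Return True if the OS name and Python (major, minor) match the stated requirements."""
    return system == "Linux" and MIN_PYTHON <= version <= MAX_PYTHON


if __name__ == "__main__":
    current = (sys.version_info.major, sys.version_info.minor)
    verdict = "supported" if is_supported(platform.system(), current) else "not supported"
    print(f"{platform.system()} / Python {current[0]}.{current[1]}: {verdict}")
```

Running the script in the environment where you plan to install the wheel prints whether that platform and interpreter fall inside the stated range.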

Installation

Before installing the latest version, uninstall any previous CUTLASS DSL installation:

pip uninstall nvidia-cutlass-dsl nvidia-cutlass-dsl-libs-base nvidia-cutlass-dsl-libs-cu13 -y

To ensure compatibility with the examples and code on GitHub, use the setup.sh file from the corresponding commit in the repository:

git clone https://github.com/NVIDIA/cutlass.git

# For CUDA Toolkit 12.9:
./cutlass/python/CuTeDSL/setup.sh --cu12

# For CUDA Toolkit 13.1:
./cutlass/python/CuTeDSL/setup.sh --cu13

If you only want to try the last known stable release of the CUTLASS DSL (which may not be compatible with the latest examples and code), run:

# For CUDA Toolkit 12.9:
pip install nvidia-cutlass-dsl

# For CUDA Toolkit 13.1:
pip install "nvidia-cutlass-dsl[cu13]"

The nvidia-cutlass-dsl wheel includes everything needed to generate GPU kernels. It requires the same NVIDIA driver version as the corresponding CUDA Toolkit (CUDA Toolkit 12.9 or CUDA Toolkit 13.1).
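To check the result of the installation, the sketch below reports which of the CUTLASS DSL distributions are present and which NVIDIA driver version the machine exposes. It is an illustration under stated assumptions, not an official verification tool: the helper names `installed_versions` and `driver_version` are hypothetical, the distribution names are taken from the uninstall command above, and the driver query assumes `nvidia-smi` is on the `PATH`. It does not decide whether the driver is new enough; compare the reported version against the requirements of your CUDA Toolkit.

```python
import shutil
import subprocess
from importlib.metadata import PackageNotFoundError, version

# Distribution names taken from the uninstall command in this guide.
PACKAGES = (
    "nvidia-cutlass-dsl",
    "nvidia-cutlass-dsl-libs-base",
    "nvidia-cutlass-dsl-libs-cu13",
)


def installed_versions(names):
    """Map each installed distribution name to its version; skip missing ones."""
    found = {}
    for name in names:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            continue
    return found


def driver_version():
    """Return the driver version reported by nvidia-smi, or None if unavailable."""
    if shutil.which("nvidia-smi") is None:
        return None
    result = subprocess.run(
        ["nvidia-smi", "--query-gpu=driver_version", "--format=csv,noheader"],
        capture_output=True,
        text=True,
        check=False,
    )
    if result.returncode != 0 or not result.stdout.strip():
        return None
    # With several GPUs, nvidia-smi prints one line per device; take the first.
    return result.stdout.strip().splitlines()[0].strip()


if __name__ == "__main__":
    for name, ver in installed_versions(PACKAGES).items():
        print(f"{name}=={ver}")
    print("NVIDIA driver:", driver_version() or "not detected")
```

The same script can be run after the uninstall step: an empty package listing there suggests that no earlier CUTLASS DSL distributions remain in the environment.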