nemo_automodel.components.models.glm_moe_dsa.kernels


Vendored GLM-5.2 DSA TileLang kernels.

slime GLM-5.2 kernels

The vendored lightning-indexer and sparse-MLA kernels were adapted from THUDM’s slime GLM-5.2 plugin:

The slime kernels are themselves adapted from the tile-ai/tilelang DeepSeek-V3.2 examples (per-file upstream links are preserved in each file header).

Per-file source mapping:

=============================== ==============================================================
Local file                      Upstream file (slime_plugins/models/glm5/ops/)
=============================== ==============================================================
indexer.py                      indexer.py
sparse_mla.py                   sparse_mla.py
tilelang_indexer_fwd.py         tilelang_indexer_fwd.py
tilelang_indexer_bwd.py         tilelang_indexer_bwd.py
tilelang_sparse_mla_fwd.py      tilelang_sparse_mla_fwd.py
tilelang_sparse_mla_bwd.py      tilelang_sparse_mla_bwd.py
=============================== ==============================================================

Local modifications: each raw kernel file imports T/tilelang through the local _tilelang.py lazy shim, matching the DeepSeek-V4 kernels. This lets AutoModel be imported without loading the optional TileLang runtime; the real TileLang package is loaded only when a TileLang kernel is called.

The kernels are wired into AutoModel’s GLM-5.2 DSA layers via optimized_kernels.py and are gated behind backend.attn == "tilelang".

These kernels require tilelang, an optional dependency. Note that tilelang 0.1.11 must be paired with apache-tvm-ffi==0.1.11: with apache-tvm-ffi 0.1.12, import tilelang fails with a tvm-ffi type double-registration error.
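A pre-flight check for the version pairing noted above could look like the following sketch. It is not part of AutoModel; the function and table names are hypothetical, and the only pairing encoded is the one stated in this document. It uses importlib.metadata to read installed distribution versions without importing tilelang itself.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Callable, Optional

# tilelang version -> required apache-tvm-ffi version.
# Only the pairing stated in this document is listed (assumption: other
# tilelang versions are simply not checked).
REQUIRED_TVM_FFI = {"0.1.11": "0.1.11"}


def installed_version(dist: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if absent."""
    try:
        return version(dist)
    except PackageNotFoundError:
        return None


def check_tilelang_pairing(
    lookup: Callable[[str], Optional[str]] = installed_version,
) -> Optional[str]:
    """Return a human-readable problem description, or None if none is found."""
    tilelang_ver = lookup("tilelang")
    if tilelang_ver is None:
        return "tilelang is not installed; the TileLang backend is unavailable."
    required = REQUIRED_TVM_FFI.get(tilelang_ver)
    ffi_ver = lookup("apache-tvm-ffi")
    if required is not None and ffi_ver != required:
        return (
            f"tilelang {tilelang_ver} expects apache-tvm-ffi=={required}, "
            f"found {ffi_ver}."
        )
    return None
```

Calling check_tilelang_pairing() before selecting the TileLang backend gives a clearer message than the double-registration error raised at import time. The lookup parameter exists only so the check can be exercised without the packages installed.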

Submodules