nemo_automodel.components.models.deepseek_v4.kernels.tilelang_indexer
TileLang-based DSA Indexer for DeepSeek-V4.
Adapts GLM-5’s lighting_indexer to V4’s SBHD (sequence, batch, heads, dim) data layout and causal masking. Provides both a low-level per-sample interface and a batched autograd Function.
Module Contents
Classes
Functions
API
Bases: Function
Autograd function for the V4 TileLang indexer.
Inputs are in V4’s native SBHD layout:
q: [seqlen, batch, heads, dim] bf16
k: [seqlen_kv, batch, dim] bf16
weights: [seqlen, batch, heads] fp32
Main entry point for the V4 TileLang indexer.
Parameters:
[seqlen, batch, heads, dim] bf16
[seqlen_kv, batch, dim] bf16
[seqlen, batch, heads] fp32
compression ratio (4 for C4 layers)
number of top-k indices to select
optional pre-computed topk indices [batch, seqlen, topk] int32
Returns:
[batch, seqlen, topk] fp32
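To make the shapes above concrete, here is a minimal loop-based reference sketch. It is not the TileLang kernel. The function name reference_indexer is hypothetical, nested Python lists stand in for tensors, and two things are assumptions rather than facts from this page: the score follows the published DeepSeek-V3.2 lightning-indexer form (a weighted sum over heads of ReLU(q · k)), and compressed key s is treated as covering tokens [s * ratio, (s + 1) * ratio), visible to query t only once its last token is at or before t. Returning both indices and scores is a convenience for illustration.

```python
import math


def reference_indexer(q, k, weights, ratio, topk):
    """Hypothetical loop-based reference (nested lists, no tensors).

    q:       [seqlen][batch][heads][dim]
    k:       [seqlen_kv][batch][dim]
    weights: [seqlen][batch][heads]
    Returns (indices, scores), each shaped [batch][seqlen][topk].
    Slots with no visible key hold index -1 and score -inf.
    """
    seqlen, batch, seqlen_kv = len(q), len(q[0]), len(k)
    all_idx, all_scores = [], []
    for b in range(batch):
        idx_b, scores_b = [], []
        for t in range(seqlen):
            row = []
            for s in range(seqlen_kv):
                # ASSUMED causal rule: key s summarizes tokens
                # [s*ratio, (s+1)*ratio) and is visible only once
                # its last token is at or before query position t.
                if (s + 1) * ratio - 1 > t:
                    row.append(-math.inf)
                    continue
                score = 0.0
                for h, w in enumerate(weights[t][b]):
                    dot = sum(a * c for a, c in zip(q[t][b][h], k[s][b]))
                    score += w * max(dot, 0.0)  # ReLU, then per-head weight
                row.append(score)
            # Highest score first; ties go to the lower key index.
            order = sorted(range(seqlen_kv), key=lambda s: (-row[s], s))[:topk]
            order += [-1] * (topk - len(order))  # pad if seqlen_kv < topk
            idx_b.append(
                [s if s >= 0 and row[s] > -math.inf else -1 for s in order]
            )
            scores_b.append([row[s] if s >= 0 else -math.inf for s in order])
        all_idx.append(idx_b)
        all_scores.append(scores_b)
    return all_idx, all_scores


# Tiny example: seqlen=4, batch=1, heads=1, dim=2, ratio=2, so seqlen_kv=2.
q = [[[[1.0, 0.0]]] for _ in range(4)]
k = [[[2.0, 0.0]], [[3.0, 0.0]]]
weights = [[[1.0]] for _ in range(4)]
indices, scores = reference_indexer(q, k, weights, ratio=2, topk=1)
print(indices)  # [[[-1], [0], [0], [1]]]
```

Under the assumed masking rule, query 0 sees no compressed key, queries 1 and 2 see only key 0, and query 3 sees both keys and picks key 1 because its score is higher. A sketch like this can serve as a slow oracle when checking a fused kernel on small inputs, provided the masking rule is first confirmed against the actual implementation.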