nemo_automodel.components.models.deepseek_v4.kernels.tilelang_indexer_fwd
Module Contents
Functions
API
Generate cu_seqlens for causal masking on compressed KV positions.
For a query at position p, the valid compressed groups are the half-open range [0, (p+1) // compress_ratio).
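The rule above can be sketched in plain Python. This is a minimal illustration, not the module's implementation: the function name `causal_compressed_ranges` is hypothetical, and the real helper presumably returns int32 tensors rather than lists.

```python
def causal_compressed_ranges(seq_len: int, compress_ratio: int):
    """Return (ks, ke), the per-query start and end of the valid compressed-KV range.

    Hypothetical helper for illustration only. For query position p the valid
    compressed groups are the half-open range [0, (p + 1) // compress_ratio).
    """
    if compress_ratio <= 0:
        raise ValueError("compress_ratio must be positive")
    ks = [0] * seq_len  # every range starts at group 0
    ke = [(p + 1) // compress_ratio for p in range(seq_len)]  # exclusive end
    return ks, ke


ks, ke = causal_compressed_ranges(seq_len=8, compress_ratio=4)
print(ke)  # [0, 0, 0, 1, 1, 1, 1, 2]
```

With a compression ratio of 4, the first three queries see no compressed group at all, and a group only becomes visible once all four of its source positions are at or before the query.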
Batched forward pass: loops over the batch dimension.
Parameters:
q: [seqlen, batch, heads, dim] bf16
k: [seqlen_kv, batch, dim] bf16
weights: [seqlen, batch, heads] fp32
cu_seqlen_ks: [seqlen] int32
cu_seqlen_ke: [seqlen] int32
Returns:
[batch, seqlen, seqlen_kv] fp32
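To make the layout change concrete, here is a dependency-free sketch of how a loop over the batch dimension could slice sequence-major inputs and stack per-element results batch-first. The names `batched_forward` and `single_fn` are hypothetical stand-ins, nested lists stand in for tensors, and the per-element computation is deliberately left as a callable.

```python
def batched_forward(q, k, weights, cu_seqlen_ks, cu_seqlen_ke, single_fn):
    """Apply a single-batch-element forward to each slice of the batch dim.

    Hypothetical sketch. Input layouts follow the documented shapes:
      q [seqlen][batch][heads][dim], k [seqlen_kv][batch][dim],
      weights [seqlen][batch][heads].
    cu_seqlen_ks / cu_seqlen_ke have no batch dim, so they are shared.
    """
    batch = len(q[0])
    out = []
    for b in range(batch):
        q_b = [row[b] for row in q]        # [seqlen][heads][dim]
        k_b = [row[b] for row in k]        # [seqlen_kv][dim]
        w_b = [row[b] for row in weights]  # [seqlen][heads]
        out.append(single_fn(q_b, k_b, w_b, cu_seqlen_ks, cu_seqlen_ke))
    return out  # [batch][seqlen][seqlen_kv]


# Shape check with a dummy per-element function (not the real kernel).
seqlen, seqlen_kv, batch, heads, dim = 3, 4, 2, 1, 2
q = [[[[0.0] * dim for _ in range(heads)] for _ in range(batch)] for _ in range(seqlen)]
k = [[[0.0] * dim for _ in range(batch)] for _ in range(seqlen_kv)]
w = [[[1.0] * heads for _ in range(batch)] for _ in range(seqlen)]
dummy = lambda q_b, k_b, w_b, ks, ke: [[0.0] * len(k_b) for _ in q_b]
out = batched_forward(q, k, w, [0] * seqlen, [seqlen_kv] * seqlen, dummy)
print(len(out), len(out[0]), len(out[0][0]))  # 2 3 4
```

Note that the inputs are sequence-major while the result is batch-major, so callers should not assume the output shares the inputs' leading dimension.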
Forward interface matching GLM-5’s API but for a single batch element.
Parameters:
q: [seq_len, heads, index_dim] bf16
kv: [seq_len_kv, index_dim] bf16
weights: [seq_len, heads] fp32
cu_seqlen_ks: [seq_len] int32, start of valid KV range per query
cu_seqlen_ke: [seq_len] int32, end of valid KV range per query
Returns:
[seq_len, seq_len_kv] fp32
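The page does not state the scoring formula, so the following is only a reference sketch under explicit assumptions: each logit is taken to be a per-head weighted sum of ReLU-ed query-key dot products, and positions outside [cu_seqlen_ks, cu_seqlen_ke) are filled with negative infinity. Both choices, and the name `indexer_fwd_reference`, are assumptions for illustration; the actual kernel may differ.

```python
import math


def indexer_fwd_reference(q, kv, weights, cu_seqlen_ks, cu_seqlen_ke):
    """Pure-Python reference for one batch element (hypothetical sketch).

    q [seq_len][heads][index_dim], kv [seq_len_kv][index_dim],
    weights [seq_len][heads]. Returns logits [seq_len][seq_len_kv].
    ASSUMPTION: logit = sum_h weights[i][h] * relu(q[i][h] . kv[j]);
    ASSUMPTION: masked positions are set to -inf.
    """
    seq_len, seq_len_kv = len(q), len(kv)
    logits = [[-math.inf] * seq_len_kv for _ in range(seq_len)]
    for i in range(seq_len):
        start = max(cu_seqlen_ks[i], 0)
        end = min(cu_seqlen_ke[i], seq_len_kv)
        for j in range(start, end):  # only the valid KV range for query i
            score = 0.0
            for h, q_head in enumerate(q[i]):
                dot = sum(a * b for a, b in zip(q_head, kv[j]))
                score += weights[i][h] * max(dot, 0.0)  # ReLU, then head weight
            logits[i][j] = score
    return logits


q = [[[1.0, 0.0]], [[0.0, 1.0]]]   # seq_len=2, heads=1, index_dim=2
kv = [[1.0, 2.0], [-3.0, 4.0]]     # seq_len_kv=2
weights = [[0.5], [2.0]]
logits = indexer_fwd_reference(q, kv, weights, [0, 0], [1, 2])
print(logits)  # [[0.5, -inf], [4.0, 8.0]]
```

A slow reference like this is mainly useful for checking a fused kernel's output on small inputs, where the expected values can be verified by hand.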