# nemo_automodel.components.models.deepseek_v4.kernels

Vendored DeepSeek V4 TileLang kernels.

## Miles DeepSeek V4 kernels

The vendored sparse attention and indexer kernels were adapted from the Miles
DeepSeek V4 implementation:

* Upstream project: [https://github.com/yueming-yuan/miles](https://github.com/yueming-yuan/miles)
* Upstream branch: `deepseek-v4`
* Upstream revision: `e561465d0b9bbf06188b7a5e2020dc7fd691f732`
* Upstream source tree:
  [https://github.com/yueming-yuan/miles/tree/e561465d0b9bbf06188b7a5e2020dc7fd691f732/miles\_plugins/models/deepseek\_v4/ops](https://github.com/yueming-yuan/miles/tree/e561465d0b9bbf06188b7a5e2020dc7fd691f732/miles_plugins/models/deepseek_v4/ops)
* Upstream license: Apache License 2.0
* Upstream copyright notice from the Miles license: Copyright 2025 Zhipu AI
* Upstream NOTICE file: none present at the referenced revision

Per-file source mapping:

| Local file | Upstream file |
| --- | --- |
| `sparse_attention.py` | `miles_plugins/models/deepseek_v4/ops/attention_core.py` |
| `tilelang_indexer.py` | `miles_plugins/models/deepseek_v4/ops/kernel/tilelang_indexer.py` |
| `tilelang_indexer_bwd.py` | `miles_plugins/models/deepseek_v4/ops/kernel/tilelang_indexer_bwd.py` |
| `tilelang_indexer_fwd.py` | `miles_plugins/models/deepseek_v4/ops/kernel/tilelang_indexer_fwd.py` |
| `tilelang_sparse_mla_bwd.py` | `miles_plugins/models/deepseek_v4/ops/kernel/tilelang_sparse_mla_bwd.py` |
| `tilelang_sparse_mla_fwd.py` | `miles_plugins/models/deepseek_v4/ops/kernel/tilelang_sparse_mla_fwd.py` |

Local modifications adapt the kernels to AutoModel's DeepSeek V4 tensor
layouts and add packed-sequence dispatch, optional backend selection, and
forward/backward parity tests against the torch reference implementation.

## DeepSeek TileKernels

The optimized Sinkhorn path imports DeepSeek TileKernels at runtime. AutoModel
does not vendor the TileKernels source code.

* Upstream project: [https://github.com/deepseek-ai/TileKernels](https://github.com/deepseek-ai/TileKernels)
* Upstream revision used for validation: `36d9e45d38e204ebb87e6f6e833821eee0482fe5`
* Imported symbol: `tile_kernels.modeling.mhc.ops.sinkhorn_normalize`
* Upstream source:
  [https://github.com/deepseek-ai/TileKernels/blob/36d9e45d38e204ebb87e6f6e833821eee0482fe5/tile\_kernels/modeling/mhc/ops/sinkhorn.py](https://github.com/deepseek-ai/TileKernels/blob/36d9e45d38e204ebb87e6f6e833821eee0482fe5/tile_kernels/modeling/mhc/ops/sinkhorn.py)
* Upstream license: MIT License
* Upstream copyright notice: Copyright 2026 DeepSeek
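
Because TileKernels is imported at runtime rather than vendored, code that uses the optimized path has to handle the package being absent. The following is a minimal sketch of one way to resolve an optional symbol by its dotted path. The helper name `load_optional_symbol` is hypothetical, and this is not necessarily how AutoModel performs the import; only the dotted path `tile_kernels.modeling.mhc.ops.sinkhorn_normalize` comes from the notes above.

```python
import importlib


def load_optional_symbol(dotted_path):
    """Return the object at `dotted_path`, or None if it cannot be imported.

    Hypothetical helper for illustration: the last component is treated as an
    attribute name and everything before it as the module path.
    """
    module_path, _, attr_name = dotted_path.rpartition(".")
    if not module_path:
        return None
    try:
        module = importlib.import_module(module_path)
    except ImportError:
        # Covers ModuleNotFoundError, e.g. when TileKernels is not installed.
        return None
    return getattr(module, attr_name, None)


# The symbol path is taken from the provenance notes; the result is None
# whenever the tile_kernels package is unavailable in the environment.
sinkhorn_normalize = load_optional_symbol(
    "tile_kernels.modeling.mhc.ops.sinkhorn_normalize"
)
has_tile_kernels = sinkhorn_normalize is not None
```

A caller could branch on `has_tile_kernels` to choose between the optimized path and a reference implementation, which keeps the dependency optional at install time.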

## Submodules

* **[`nemo_automodel.components.models.deepseek_v4.kernels._tilelang`](/nemo-automodel/nemo_automodel/components/models/deepseek_v4/kernels/_tilelang)**
* **[`nemo_automodel.components.models.deepseek_v4.kernels.sparse_attention`](/nemo-automodel/nemo_automodel/components/models/deepseek_v4/kernels/sparse_attention)**
* **[`nemo_automodel.components.models.deepseek_v4.kernels.tilelang_indexer`](/nemo-automodel/nemo_automodel/components/models/deepseek_v4/kernels/tilelang_indexer)**
* **[`nemo_automodel.components.models.deepseek_v4.kernels.tilelang_indexer_bwd`](/nemo-automodel/nemo_automodel/components/models/deepseek_v4/kernels/tilelang_indexer_bwd)**
* **[`nemo_automodel.components.models.deepseek_v4.kernels.tilelang_indexer_fwd`](/nemo-automodel/nemo_automodel/components/models/deepseek_v4/kernels/tilelang_indexer_fwd)**
* **[`nemo_automodel.components.models.deepseek_v4.kernels.tilelang_sparse_mla_bwd`](/nemo-automodel/nemo_automodel/components/models/deepseek_v4/kernels/tilelang_sparse_mla_bwd)**
* **[`nemo_automodel.components.models.deepseek_v4.kernels.tilelang_sparse_mla_fwd`](/nemo-automodel/nemo_automodel/components/models/deepseek_v4/kernels/tilelang_sparse_mla_fwd)**