# nemo_automodel.components.models.common.mtp

Shared scaffolding for Multi-Token Prediction (MTP) auxiliary heads.

MTP follows the DeepSeek-V3 design (Liu et al., 2024). Each MTP "depth"
predicts one additional future token. At each depth, the input is rolled left
by one position, fused with the previous depth's hidden state, and passed
through an inner block; the shared LM head then produces the logits.
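
The per-depth left roll can be illustrated with a minimal, framework-free sketch. The helper names `roll_left` and `mtp_inputs_per_depth` and the fill value `0` are hypothetical and chosen for illustration only; they are not the package's `roll_tensor` or its signature, and the fusion step and inner block are omitted.

```python
from typing import List, Sequence


def roll_left(seq: Sequence[int], fill: int = 0) -> List[int]:
    """Shift a sequence one position to the left.

    Hypothetical helper: the vacated last slot is set to `fill`
    (an assumed placeholder) instead of wrapping around.
    """
    if len(seq) == 0:
        return []
    return list(seq[1:]) + [fill]


def mtp_inputs_per_depth(
    token_ids: Sequence[int], num_depths: int, fill: int = 0
) -> List[List[int]]:
    """Return the rolled input seen at each MTP depth.

    Depth 0 sees the tokens rolled left once, depth 1 twice, and so on,
    so each depth is aligned with one token further into the future.
    """
    rolled = list(token_ids)
    per_depth = []
    for _ in range(num_depths):
        rolled = roll_left(rolled, fill)
        per_depth.append(rolled)
    return per_depth


tokens = [10, 11, 12, 13, 14]
per_depth = mtp_inputs_per_depth(tokens, num_depths=2)
# per_depth[0] is [11, 12, 13, 14, 0]
# per_depth[1] is [12, 13, 14, 0, 0]
```

In this sketch, position `i` at depth `k` holds the token originally at position `i + k + 1`, which is why trailing positions have no valid input and would typically be masked out of the loss.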

Components in this package are model-agnostic. Model-specific glue (building
the inner block out of the model's own decoder layers, wiring HF state-dict
keys) lives in the model's own package.

## Submodules

* **[`nemo_automodel.components.models.common.mtp.mtp`](/nemo-automodel/nemo_automodel/components/models/common/mtp/mtp)**

## Package Contents

### Data

[`__all__`](#nemo_automodel-components-models-common-mtp-__all__)

### API

```python
nemo_automodel.components.models.common.mtp.__all__ = ['MTPConfig', 'MTPModule', 'get_mtp_loss_scaling_factor', 'roll_tensor']
```