nemo_automodel.components.models.step3p7.mtp
Step3 Multi-Token Prediction blocks.
Step checkpoints store MTP depths after the main decoder layers as
model.layers.{num_hidden_layers + depth}.*. Each depth has the same
decoder block structure plus fusion modules (enorm, hnorm, eh_proj)
and an MTP-local shared head under transformer.shared_head.
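The index arithmetic behind this key layout can be sketched as follows. This is a minimal illustration, not the module's own code: the helpers mtp_layer_prefix and split_mtp_keys are hypothetical names, the depth index is assumed to be zero-based, and the example key suffixes (such as enorm.weight) are made up for the demonstration.

```python
import re

# Hypothetical helper (not part of the module): key prefix for one MTP depth.
# Assumes depth is zero-based, so depth 0 lands right after the last decoder layer.
def mtp_layer_prefix(num_hidden_layers: int, depth: int) -> str:
    if depth < 0:
        raise ValueError("depth must be non-negative")
    return f"model.layers.{num_hidden_layers + depth}."

_LAYER_KEY = re.compile(r"^model\.layers\.(\d+)\.")

# Hypothetical helper: group checkpoint keys by MTP depth.
# Keys for main decoder layers and non-layer keys are skipped.
def split_mtp_keys(keys, num_hidden_layers: int) -> dict:
    by_depth = {}
    for key in keys:
        match = _LAYER_KEY.match(key)
        if match is None:
            continue  # not a per-layer key (e.g. embeddings)
        layer_idx = int(match.group(1))
        if layer_idx >= num_hidden_layers:
            by_depth.setdefault(layer_idx - num_hidden_layers, []).append(key)
    return by_depth

# Illustrative keys only; real checkpoints will have different suffixes.
example_keys = [
    "model.embed_tokens.weight",
    "model.layers.3.mlp.weight",
    "model.layers.4.enorm.weight",
    "model.layers.5.eh_proj.weight",
]
grouped = split_mtp_keys(example_keys, num_hidden_layers=4)
```

With four main decoder layers, indices 4 and 5 map to MTP depths 0 and 1 under the zero-based assumption above.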
Module Contents
Classes
Functions
API
Bases: Module
Stack of Step MTP depths.
layers
num_depths
Bases: Module
Per-depth Step MTP prediction head.
norm
output
Return a shallow config copy patched for a dense sliding-attention MTP layer.
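The general "shallow copy, then patch" pattern this description refers to might look like the sketch below. It is an assumption-laden illustration only: patched_copy is a hypothetical helper, and the fields attention_kind and use_moe are invented stand-ins, not the actual config fields this function changes.

```python
import copy
from types import SimpleNamespace

# Hypothetical helper: shallow-copy a config object and override some fields.
# The original object is left untouched; nested objects are shared, not cloned.
def patched_copy(config, **overrides):
    patched = copy.copy(config)
    for name, value in overrides.items():
        setattr(patched, name, value)
    return patched

# Field names below are invented for illustration.
base = SimpleNamespace(hidden_size=8, attention_kind="full", use_moe=True)
mtp_cfg = patched_copy(base, attention_kind="sliding", use_moe=False)
```

Because the copy is shallow, mutating a nested object (a list or sub-config) through the patched copy would also affect the original; replacing an attribute, as done here, does not.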
Build Step MTP runtime config from HF-style config fields.
Construct Step MTP depths.