nemo_automodel.components.speculative.eagle.registry
Dispatch registry mapping target architecture -> EAGLE draft model.
Mirrors SpecForge’s modeling/auto.py pattern, but is keyed by the HF
architectures string instead of by config class. The string key avoids a
hard import dependency on each individual HF *Config class (which can
differ between transformers versions) and matches NeMo AutoModel’s existing
_transformers/registry.MODEL_ARCH_MAPPING style.
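The string-keyed dispatch described above can be sketched as follows. This is a minimal illustration, not the module's actual code: DraftSpec, _REGISTRY, and resolve_draft_spec are hypothetical names, the Llama entry is an assumed example of a registered architecture, and draft classes are represented by their names as strings so the sketch stays import-free. Matching in registry order is also an assumption.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class DraftSpec:
    """Hypothetical record: which draft to build for one target architecture."""

    architecture: str  # a value that may appear in config.architectures
    draft_cls: str     # draft class name (a string here, to avoid imports)


# Hypothetical registry contents; the real module's entries may differ.
_REGISTRY = (
    DraftSpec("LlamaForCausalLM", "LlamaEagle3DraftModel"),
    DraftSpec("GptOssForCausalLM", "GptOssEagle3DraftModel"),
)


def resolve_draft_spec(architectures: Optional[Sequence[str]]) -> DraftSpec:
    """Return the first registered spec whose key appears in `architectures`."""
    wanted = set(architectures or ())
    for spec in _REGISTRY:  # registry order decides ties (an assumption)
        if spec.architecture in wanted:
            return spec
    raise ValueError(f"No EAGLE draft registered for architectures={architectures!r}")
```

Because the lookup only compares strings, no HF *Config class has to be imported to decide which draft applies; the caller passes the target's config.architectures list as-is.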
The dense draft (LlamaEagle3DraftModel / LlamaEagleDraftModel) covers
most registered architectures. Its implementation is config-driven: it reads
attention_bias, mlp_bias, head_dim, rope_theta /
rope_scaling, and rms_norm_eps directly from the target config, so
adding an architecture that fits this shape is a one-line registry
append. Architectures that need a different draft get a new draft_cls
entry pointing at a dedicated draft module. For example, gpt-oss
(GptOssForCausalLM) uses YaRN RoPE (rope_type="yarn"), which
LlamaRotaryEmbedding does not implement, so it is mapped to
GptOssEagle3DraftModel, a thin subclass that swaps in gpt-oss’s YaRN
rotary so the draft stays positionally consistent with the target. See
draft_gpt_oss.py for the rationale.
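To illustrate what "config-driven" means here, the sketch below collects the listed fields from a target config object. It is an assumption-laden example rather than the draft model's real code: read_draft_settings is a hypothetical helper, the fallback defaults are illustrative guesses, and deriving head_dim from hidden_size and num_attention_heads when it is absent is a common convention assumed for the example.

```python
from types import SimpleNamespace
from typing import Any, Dict


def read_draft_settings(target_config: Any) -> Dict[str, Any]:
    """Hypothetical helper: gather the fields a config-driven draft would read."""
    # Assumed fallback: derive head_dim when the config does not set it.
    head_dim = getattr(target_config, "head_dim", None)
    if head_dim is None:
        head_dim = target_config.hidden_size // target_config.num_attention_heads

    return {
        # Default values below are illustrative assumptions, not library facts.
        "attention_bias": getattr(target_config, "attention_bias", False),
        "mlp_bias": getattr(target_config, "mlp_bias", False),
        "head_dim": head_dim,
        "rope_theta": getattr(target_config, "rope_theta", 10000.0),
        "rope_scaling": getattr(target_config, "rope_scaling", None),
        "rms_norm_eps": getattr(target_config, "rms_norm_eps", 1e-6),
    }


# Stand-in for a target config; a real one would come from the target model.
example_config = SimpleNamespace(
    hidden_size=4096,
    num_attention_heads=32,
    rms_norm_eps=1e-5,
)
settings = read_draft_settings(example_config)
```

A target whose config exposes these fields with Llama-compatible meaning needs no new draft code, which is why registering it can be a single registry entry. A target such as gpt-oss, whose rope_scaling requests YaRN, needs the dedicated draft class instead.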
Module Contents
Classes
Functions
Data
API
How to build an EAGLE draft model for a particular target architecture.
Return the first registered draft spec matching any architecture in the list.
Resolve the EAGLE-1 / EAGLE-2 draft spec for a target’s config.architectures field.
Resolve the EAGLE-3 draft spec for a target’s config.architectures field.