nemo_automodel.components.speculative.eagle.registry


Dispatch registry mapping target architecture -> EAGLE draft model.

Mirrors SpecForge’s modeling/auto.py pattern, but is keyed by the Hugging Face architectures string rather than by config class. The string key avoids a hard import dependency on each individual HF *Config class (which can differ between transformers versions) and matches the style of NeMo AutoModel’s existing _transformers/registry.MODEL_ARCH_MAPPING.

The dense draft (LlamaEagle3DraftModel / LlamaEagleDraftModel) covers most registered architectures: the implementation is config-driven and reads attention_bias, mlp_bias, head_dim, rope_theta / rope_scaling, and rms_norm_eps directly from the target config. Adding an architecture that fits this shape is a one-line registry append. Architectures that need a different draft get a new draft_cls entry pointing at a dedicated draft module. For example, gpt-oss (GptOssForCausalLM) uses YaRN RoPE (rope_type="yarn"), which LlamaRotaryEmbedding does not implement, so it is mapped to GptOssEagle3DraftModel, a thin subclass that swaps in gpt-oss’s YaRN rotary so the draft stays positionally consistent with the target. See draft_gpt_oss.py for the rationale.
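The registry pattern described above can be sketched in a few lines. This is a minimal, self-contained illustration rather than the module's actual source: the draft classes are empty placeholders standing in for the real PreTrainedModel subclasses, the architecture tuple lists only a few of the names visible in _DENSE_ARCHITECTURES, and both the parent class of the gpt-oss draft and the way its entry is added to the registry are assumptions.

```python
from dataclasses import dataclass


# Placeholder: the real dense draft is a transformers.PreTrainedModel subclass.
class LlamaEagle3DraftModel:
    pass


# Placeholder: assumed here to subclass the dense draft; the real class swaps
# in gpt-oss's YaRN rotary embedding.
class GptOssEagle3DraftModel(LlamaEagle3DraftModel):
    pass


@dataclass
class DraftSpec:
    """How to build a draft model for one target architecture."""

    draft_cls: type


# Illustrative subset of the dense architectures, not the full tuple.
_DENSE_ARCHITECTURES = ("LlamaForCausalLM", "Phi3ForCausalLM", "Qwen3ForCausalLM")

# Every dense architecture shares the same config-driven draft class.
EAGLE3_DRAFT_REGISTRY = {
    arch: DraftSpec(draft_cls=LlamaEagle3DraftModel) for arch in _DENSE_ARCHITECTURES
}

# Assumption: an architecture needing a dedicated draft gets its own entry.
EAGLE3_DRAFT_REGISTRY["GptOssForCausalLM"] = DraftSpec(draft_cls=GptOssEagle3DraftModel)
```

Keying on the architecture string means the registry itself never imports a target config class, so adding a dense architecture amounts to adding one string to the tuple.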

Module Contents

Classes

| Name | Description |
| --- | --- |
| DraftSpec | How to build an EAGLE draft model for a particular target architecture. |

Functions

| Name | Description |
| --- | --- |
| _resolve | Return the first registered draft spec matching any architecture in the list. |
| resolve_eagle1_draft_spec | Resolve the EAGLE-1 / EAGLE-2 draft spec for a target’s config.architectures field. |
| resolve_eagle3_draft_spec | Resolve the EAGLE-3 draft spec for a target’s config.architectures field. |

Data

EAGLE1_DRAFT_REGISTRY

EAGLE3_DRAFT_REGISTRY

_DENSE_ARCHITECTURES

API

class nemo_automodel.components.speculative.eagle.registry.DraftSpec(
draft_cls: type[transformers.PreTrainedModel]
)
Dataclass

How to build an EAGLE draft model for a particular target architecture.

draft_cls
type[PreTrainedModel]

nemo_automodel.components.speculative.eagle.registry._resolve(
architectures: list[str],
registry: dict[str, nemo_automodel.components.speculative.eagle.registry.DraftSpec],
recipe_name: str
) -> nemo_automodel.components.speculative.eagle.registry.DraftSpec

Return the first registered draft spec matching any architecture in the list.
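A first-match lookup of this kind might look like the sketch below. It is not the module's source: the function name resolve_first_match, the placeholder DenseDraft class, and in particular the ValueError raised when nothing matches are assumptions, since this page does not document the failure behaviour or how recipe_name is used.

```python
from dataclasses import dataclass


@dataclass
class DraftSpec:
    draft_cls: type


def resolve_first_match(architectures, registry, recipe_name):
    """Return the spec for the first architecture name found in the registry."""
    for arch in architectures:
        spec = registry.get(arch)
        if spec is not None:
            return spec
    # Assumption: report the recipe and the supported names on a miss.
    raise ValueError(
        f"{recipe_name}: no draft model registered for architectures "
        f"{architectures!r}; supported: {sorted(registry)}"
    )


class DenseDraft:  # hypothetical placeholder draft class
    pass


registry = {"LlamaForCausalLM": DraftSpec(draft_cls=DenseDraft)}

# The first name is unregistered, so the lookup falls through to the second.
spec = resolve_first_match(
    ["UnknownForCausalLM", "LlamaForCausalLM"], registry, "EAGLE-3"
)
```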

nemo_automodel.components.speculative.eagle.registry.resolve_eagle1_draft_spec(
architectures: list[str]
) -> nemo_automodel.components.speculative.eagle.registry.DraftSpec

Resolve the EAGLE-1 / EAGLE-2 draft spec for a target’s config.architectures field.

nemo_automodel.components.speculative.eagle.registry.resolve_eagle3_draft_spec(
architectures: list[str]
) -> nemo_automodel.components.speculative.eagle.registry.DraftSpec

Resolve the EAGLE-3 draft spec for a target’s config.architectures field.
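Both resolvers take the architectures list from the target's config. The helper below is a hypothetical convenience wrapper, not part of this module: it accepts the resolver as an argument so the snippet runs without NeMo AutoModel installed, and it guards against architectures being None, which Hugging Face configs allow. The FakeDraft class and fake_resolve function are stand-ins for illustration only.

```python
from types import SimpleNamespace


def draft_cls_for_target(config, resolve):
    """Look up the draft class for a target config via the given resolver."""
    # architectures may be missing or None on a config, so normalise to a list.
    architectures = list(getattr(config, "architectures", None) or [])
    if not architectures:
        raise ValueError("target config has no architectures entry to dispatch on")
    return resolve(architectures).draft_cls


class FakeDraft:  # stand-in for a real draft model class
    pass


def fake_resolve(architectures):
    # Stand-in resolver with the same shape: list[str] in, spec with draft_cls out.
    return SimpleNamespace(draft_cls=FakeDraft)


target_config = SimpleNamespace(architectures=["LlamaForCausalLM"])
draft_cls = draft_cls_for_target(target_config, fake_resolve)
```

With the package installed, resolve would be resolve_eagle3_draft_spec or resolve_eagle1_draft_spec imported from this module, and config would be the target model's loaded config.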

nemo_automodel.components.speculative.eagle.registry.EAGLE1_DRAFT_REGISTRY: dict[str, DraftSpec] = {arch: (DraftSpec(draft_cls=LlamaEagleDraftModel)) for arch in _DENSE_ARCHITECTU...
nemo_automodel.components.speculative.eagle.registry.EAGLE3_DRAFT_REGISTRY: dict[str, DraftSpec] = {arch: (DraftSpec(draft_cls=LlamaEagle3DraftModel)) for arch in _DENSE_ARCHITECT...
nemo_automodel.components.speculative.eagle.registry._DENSE_ARCHITECTURES: tuple[str, ...] = ('LlamaForCausalLM', 'Phi3ForCausalLM', 'Qwen3ForCausalLM', 'Qwen3MoeForCausalLM...