# nemo_automodel.components.speculative.eagle.core

Core EAGLE-3 training logic for the minimal Llama MVP.

## Module Contents

### Classes

| Name                                                                                           | Description                                                       |
| ---------------------------------------------------------------------------------------------- | ----------------------------------------------------------------- |
| [`Eagle3StepMetrics`](#nemo_automodel-components-speculative-eagle-core-Eagle3StepMetrics)     | Aggregated metrics from one EAGLE-3 training step.                |
| [`Eagle3TrainerModule`](#nemo_automodel-components-speculative-eagle-core-Eagle3TrainerModule) | Draft-side EAGLE-3 trainer module with test-time-training unroll. |

### Functions

| Name                                                                                                             | Description                                                                   |
| ---------------------------------------------------------------------------------------------------------------- | ----------------------------------------------------------------------------- |
| [`_compute_target_distribution`](#nemo_automodel-components-speculative-eagle-core-_compute_target_distribution) | Project target logits into draft vocabulary space and build supervision mask. |
| [`_shift_left_with_zero`](#nemo_automodel-components-speculative-eagle-core-_shift_left_with_zero)               | Shift a batched sequence tensor left and zero-fill the tail.                  |

### API

```python
class nemo_automodel.components.speculative.eagle.core.Eagle3StepMetrics(
    loss: torch.Tensor,
    accuracy: torch.Tensor,
    valid_tokens: torch.Tensor
)
```

Dataclass

Aggregated metrics from one EAGLE-3 training step.

```python
class nemo_automodel.components.speculative.eagle.core.Eagle3TrainerModule(
    draft_model: torch.nn.Module,
    selected_token_ids: torch.Tensor,
    selected_token_mask: torch.Tensor,
    ttt_steps: int
)
```

**Bases:** `Module`

Draft-side EAGLE-3 trainer module with test-time-training unroll.

```python
nemo_automodel.components.speculative.eagle.core.Eagle3TrainerModule.forward(
    input_ids: torch.Tensor,
    attention_mask: torch.Tensor,
    loss_mask: torch.Tensor,
    aux_hidden_states: torch.Tensor,
    target_logits: torch.Tensor | None = None,
    target_probs: torch.Tensor | None = None,
    position_mask: torch.Tensor | None = None,
    position_ids: torch.Tensor | None = None,
    seq_lens: torch.Tensor | None = None,
    doc_remaining: torch.Tensor | None = None
) -> nemo_automodel.components.speculative.eagle.core.Eagle3StepMetrics
```

Run the EAGLE-3 unrolled draft loss for one batch.

The attention layer is driven through a shared `cache_hidden`
list so each TTT step can attend to the K/V branches produced by
every previous step at the same position. This matches the
SpecForge `llama3_eagle.py` recurrence; without it, multi-step
TTT would degenerate into `ttt_steps` independent single-step
passes and the draft would never learn the multi-step
distribution it sees at deployment time.

`attention_mask` is held constant across TTT steps; only
`input_ids`, `loss_mask`, `position_mask`, and
`target_probs` roll forward by one position per step.

Packing: `position_ids` / `seq_lens` make the draft's Block-1 attention
document-level block-causal, and `doc_remaining` gates supervision per
step (slot `t` valid at step `k` only while `k < doc_remaining[t]`),
masking every cross-document TTT prediction.
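
The per-step gating rule can be illustrated with a small sketch. The helper name `step_supervision_mask` and the sample values are hypothetical, and the sketch assumes `doc_remaining[t]` counts how many TTT steps slot `t` can be supervised before its prediction would cross into the next packed document; the actual encoding used by the module may differ.

```python
import torch


def step_supervision_mask(doc_remaining: torch.Tensor, k: int) -> torch.Tensor:
    """Return a bool mask: slot t is supervised at step k only while k < doc_remaining[t]."""
    return doc_remaining > k


# Hypothetical packed row: a 3-token document followed by a 2-token document.
doc_remaining = torch.tensor([[3, 2, 1, 2, 1]])

# One mask per TTT step; later steps supervise fewer slots.
masks = [step_supervision_mask(doc_remaining, k) for k in range(3)]
for k, mask in enumerate(masks):
    print(k, mask.tolist())
```

Under this assumption, each additional TTT step drops the slots whose look-ahead would reach past the end of their own document.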

Two supervision sources are accepted: the live path passes the
target's full-vocab `target_logits` and the draft distribution is
derived here; the offline-cache path (`precompute_eagle3`) passes the
already-derived `target_probs` (over the draft vocab) and
`position_mask` directly, so the full-vocab logits never have to be
stored. Provide exactly one of the two.
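
A caller-side check for the "exactly one of the two" rule might look like the sketch below. `pick_supervision_source` is a hypothetical helper, not part of this module, and the requirement that `position_mask` accompany `target_probs` is an assumption drawn from the description of the offline-cache path above.

```python
from typing import Optional


def pick_supervision_source(
    target_logits: Optional[object],
    target_probs: Optional[object],
    position_mask: Optional[object],
) -> str:
    """Hypothetical validation: return "live" or "offline", or raise on bad input."""
    # Exactly one of target_logits / target_probs must be provided.
    if (target_logits is None) == (target_probs is None):
        raise ValueError("Provide exactly one of target_logits or target_probs.")
    # Assumption: the offline path needs its precomputed position_mask as well.
    if target_probs is not None and position_mask is None:
        raise ValueError("target_probs must be accompanied by position_mask.")
    return "live" if target_logits is not None else "offline"
```

Running such a check before calling `forward` surfaces a misconfigured data pipeline early, independent of how the module itself handles invalid combinations.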

```python
nemo_automodel.components.speculative.eagle.core._compute_target_distribution(
    target_logits: torch.Tensor,
    selected_token_ids: torch.Tensor,
    selected_token_mask: torch.Tensor,
    loss_mask: torch.Tensor
) -> tuple[torch.Tensor, torch.Tensor]
```

Project target logits into draft vocabulary space and build supervision mask.
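
As a rough illustration of what such a projection can look like, the sketch below gathers the draft-vocabulary columns from the full-vocabulary logits and renormalizes them. Everything here is an assumption rather than the module's actual code: the tensor shapes, the name `compute_target_distribution_sketch`, and in particular the mask rule (a position is supervised only if `loss_mask` is set and the target's top token belongs to the draft vocabulary).

```python
import torch


def compute_target_distribution_sketch(
    target_logits: torch.Tensor,        # assumed shape [batch, seq, full_vocab]
    selected_token_ids: torch.Tensor,   # assumed shape [draft_vocab], long
    selected_token_mask: torch.Tensor,  # assumed shape [full_vocab], bool
    loss_mask: torch.Tensor,            # assumed shape [batch, seq]
) -> tuple[torch.Tensor, torch.Tensor]:
    # Keep only the columns for tokens in the draft vocabulary, then renormalize.
    draft_logits = target_logits[..., selected_token_ids]
    target_probs = torch.softmax(draft_logits.float(), dim=-1)

    # Assumed rule: supervise a position only when the target's argmax token
    # is representable in the draft vocabulary and the loss mask allows it.
    top_token = target_logits.argmax(dim=-1)
    position_mask = selected_token_mask[top_token] & loss_mask.bool()
    return target_probs, position_mask


logits = torch.tensor([[[5.0, 0.0, 1.0, 0.0], [0.0, 9.0, 0.0, 0.0]]])
ids = torch.tensor([0, 2])
vocab_mask = torch.tensor([True, False, True, False])
probs, pos_mask = compute_target_distribution_sketch(
    logits, ids, vocab_mask, torch.ones(1, 2)
)
```

In this toy input the second position's top token (id 1) is outside the draft vocabulary, so the sketch masks it out of supervision.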

```python
nemo_automodel.components.speculative.eagle.core._shift_left_with_zero(
    tensor: torch.Tensor
) -> torch.Tensor
```

Shift a batched sequence tensor left and zero-fill the tail.
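
A minimal sketch of this kind of shift follows, assuming the sequence axis is dimension 1 and the shift is by one position; the function name `shift_left_with_zero_sketch` is hypothetical and the real helper may be implemented differently.

```python
import torch


def shift_left_with_zero_sketch(tensor: torch.Tensor) -> torch.Tensor:
    """Shift along dim 1 by one position and zero-fill the last slot."""
    shifted = torch.zeros_like(tensor)
    # Position t takes the value previously at t + 1; the tail stays zero.
    shifted[:, :-1] = tensor[:, 1:]
    return shifted


x = torch.tensor([[1, 2, 3, 4], [5, 6, 7, 8]])
shifted_x = shift_left_with_zero_sketch(x)
print(shifted_x.tolist())  # [[2, 3, 4, 0], [6, 7, 8, 0]]
```

Zero-filling the tail means a shifted mask automatically drops the final position, which has no next token to predict.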