# nemo_automodel.components.speculative.dflash.target

Target-model wrapper for DFlash training.

Unlike EAGLE-3, which captures exactly three auxiliary layers and left-shifts the
supervision, DFlash captures an arbitrary set of decoder layers, concatenates
their hidden states along the feature dimension, and feeds the result to the
draft model as *context*. No shifting is applied; the DFlash block attention
mask handles anchor alignment.
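
The concatenation step can be illustrated with a minimal sketch. It is not the library's implementation: plain Python lists stand in for tensors so the example runs without PyTorch, and the helper name `concat_context_features` and the sample values are hypothetical. With real tensors, the same effect would typically come from concatenating along the last dimension.

```python
from typing import List, Sequence

# [seq_len][hidden_size]; a stand-in for one layer's 2-D hidden-state tensor.
Matrix = List[List[float]]


def concat_context_features(layer_outputs: Sequence[Matrix]) -> Matrix:
    """Concatenate per-layer features token by token along the feature axis.

    Hypothetical helper for illustration only; not part of nemo_automodel.
    """
    seq_len = len(layer_outputs[0])
    if any(len(layer) != seq_len for layer in layer_outputs):
        raise ValueError("all captured layers must share the same sequence length")

    context: Matrix = []
    for t in range(seq_len):
        row: List[float] = []
        for layer in layer_outputs:
            row.extend(layer[t])  # append this layer's features for token t
        context.append(row)
    return context


# Two captured layers, 3 tokens, hidden size 2 (made-up values).
layer_a = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
layer_b = [[10.0, 20.0], [30.0, 40.0], [50.0, 60.0]]
context = concat_context_features([layer_a, layer_b])
```

Row `t` of `context` still corresponds to token `t`: the feature width grows to the number of captured layers times the hidden size, while the sequence axis is left unshifted.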

## Module Contents

### Classes

| Name                                                                                              | Description                                                              |
| ------------------------------------------------------------------------------------------------- | ------------------------------------------------------------------------ |
| [`DFlashTargetBatch`](#nemo_automodel-components-speculative-dflash-target-DFlashTargetBatch)     | Target-model context features needed by the DFlash trainer.              |
| [`HFDFlashTargetModel`](#nemo_automodel-components-speculative-dflash-target-HFDFlashTargetModel) | Capture a set of decoder-layer hidden states from a frozen HF causal LM. |

### API

```python
class nemo_automodel.components.speculative.dflash.target.DFlashTargetBatch(
    hidden_states: torch.Tensor,
    input_ids: torch.Tensor,
    attention_mask: torch.Tensor,
    loss_mask: torch.Tensor
)
```

Dataclass

Target-model context features needed by the DFlash trainer.

```python
class nemo_automodel.components.speculative.dflash.target.HFDFlashTargetModel(
    model: torch.nn.Module,
    target_layer_ids: typing.Sequence[int]
)
```

Capture a set of decoder-layer hidden states from a frozen HF causal LM.

A forward hook on decoder layer `i` captures that layer's output, which in
HuggingFace's `output_hidden_states` convention is `hidden_states[i + 1]`,
because index 0 holds the embedding output. This matches SpecForge's
`extract_context_feature` (offset 1).
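
The offset-by-one indexing can be shown with a toy sketch. It is an illustration under stated assumptions, not the class's code: scalar functions stand in for decoder layers, a dictionary write stands in for a forward hook, and `run_with_capture` is a hypothetical name.

```python
from typing import Callable, Dict, List, Tuple

Layer = Callable[[float], float]  # toy stand-in for a decoder layer


def run_with_capture(
    layers: List[Layer], embedding: float, capture_ids: List[int]
) -> Tuple[List[float], Dict[int, float]]:
    """Run toy layers in order, recording both views of the activations.

    Hypothetical helper for illustration only; not part of nemo_automodel.
    """
    hidden_states = [embedding]  # index 0 is the embedding output
    captured: Dict[int, float] = {}
    x = embedding
    for i, layer in enumerate(layers):
        x = layer(x)
        hidden_states.append(x)  # the output of layer i lands at index i + 1
        if i in capture_ids:
            captured[i] = x  # what a forward hook on layer i would record
    return hidden_states, captured


toy_layers: List[Layer] = [lambda x: x + 1.0, lambda x: x * 2.0, lambda x: x - 3.0]
hidden_states, captured = run_with_capture(toy_layers, embedding=0.5, capture_ids=[0, 2])

# Each hooked output equals the hidden_states entry one index later.
offsets_match = all(captured[i] == hidden_states[i + 1] for i in captured)
```

In this toy setup the list has one more entry than there are layers, which is why a layer ID of `i` maps to index `i + 1`.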

```python
nemo_automodel.components.speculative.dflash.target.HFDFlashTargetModel._get_transformer_layers() -> list[torch.nn.Module]
```

Return decoder layers as an ordered, integer-indexable list.

```python
nemo_automodel.components.speculative.dflash.target.HFDFlashTargetModel._validate_layer_ids(
    target_layer_ids: typing.Sequence[int]
) -> list[int]
```

```python
nemo_automodel.components.speculative.dflash.target.HFDFlashTargetModel.generate_batch(
    input_ids: torch.Tensor,
    attention_mask: torch.Tensor,
    loss_mask: torch.Tensor
) -> nemo_automodel.components.speculative.dflash.target.DFlashTargetBatch
```

Run the target model and capture the selected layers' hidden states as context.

```python
nemo_automodel.components.speculative.dflash.target.HFDFlashTargetModel.get_input_embeddings() -> torch.nn.Embedding
```

Return the target model input embeddings.