# nemo_automodel.components.datasets.dllm.collate

Collate function for dLLM training.

Expects datasets that produce **unshifted** format (`input_ids` +
`loss_mask`, via `_package_tokenized_example(unshifted=True)`).
Goes directly from variable-length sample lists to block-aligned tensors
in a single pass.

Two-stage block-aligned padding layout:

`[real tokens][EOS block-pad, loss=1][PAD global-pad, loss=0]`
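
The layout above can be illustrated for a single sample. This is a minimal sketch using plain Python lists rather than tensors; `two_stage_layout` is a hypothetical helper, not part of the module, and it assumes absolute rounding of the content length (the behavior described for `response_window=False`).

```python
def two_stage_layout(input_ids, loss_mask, block_size, target_len, eos_id, pad_id):
    """Lay out one sample as [real tokens][EOS block-pad][PAD global-pad].

    Hypothetical helper for illustration; not the library implementation.
    """
    n = len(input_ids)
    # Stage 1: round the content length up to a multiple of block_size.
    fill_end = -(-n // block_size) * block_size
    if fill_end > target_len:
        raise ValueError("target_len must cover the block-aligned length")
    n_eos = fill_end - n           # EOS block-pad, supervised (loss=1)
    n_pad = target_len - fill_end  # PAD global-pad, unsupervised (loss=0)
    ids = list(input_ids) + [eos_id] * n_eos + [pad_id] * n_pad
    loss = list(loss_mask) + [1] * n_eos + [0] * n_pad
    return ids, loss


ids, loss = two_stage_layout(
    input_ids=[11, 12, 13, 14, 15],
    loss_mask=[0, 0, 1, 1, 1],
    block_size=4,
    target_len=12,
    eos_id=2,
    pad_id=0,
)
print(ids)   # [11, 12, 13, 14, 15, 2, 2, 2, 0, 0, 0, 0]
print(loss)  # [0, 0, 1, 1, 1, 1, 1, 1, 0, 0, 0, 0]
```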

## Module Contents

### Classes

| Name                                                                            | Description                                 |
| ------------------------------------------------------------------------------- | ------------------------------------------- |
| [`DLLMCollator`](#nemo_automodel-components-datasets-dllm-collate-DLLMCollator) | Collator for dLLM (diffusion LLM) training. |

### API

```python
class nemo_automodel.components.datasets.dllm.collate.DLLMCollator(
    pad_token_id: int = 0,
    eos_token_id: typing.Optional[int] = None,
    block_size: typing.Optional[int] = None,
    pad_seq_len_divisible: typing.Optional[int] = None,
    max_seq_len: typing.Optional[int] = None,
    supervise_padding: bool = False,
    response_window: bool = False
)
```

Collator for dLLM (diffusion LLM) training.

Goes directly from variable-length sample dicts to block-aligned
tensors in a single pass, with no intermediate pad-to-max step.

Expects each sample to have `input_ids`, `loss_mask`, and
`attention_mask` (as produced by
`_package_tokenized_example(unshifted=True)`).

**Parameters:**

`pad_token_id`: Token ID for global (stage-2) padding.

`eos_token_id`: Token ID for block (stage-1) padding. Only used
when *block\_size* is set.

`block_size`: If set, apply two-stage block-aligned padding.

`pad_seq_len_divisible`: Round final length to
`lcm(block_size, pad_seq_len_divisible)`.

`response_window`: gemma4 response-window mode. When `True`, the EOS
block-fill is response-relative (aligned on the first supervised
position, matching Google's ChunkResponseIntoCanvases) and is marked
**attended** (`attention_mask=1`), and a one-time single-turn guard
rejects multi-turn `loss_mask`. When `False` (the default; llada /
nemotron full-sequence denoising), the fill is absolute (block-aligned
on the content length) and **not** attended, and no single-turn guard
runs. This is the pre-response-window behavior.
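
The length rounding can be sketched as follows. This reads "round to `lcm(block_size, pad_seq_len_divisible)`" as rounding up to the next multiple of the least common multiple, which is an assumption; `round_up_to_lcm` is a hypothetical name, and `math.lcm` requires Python 3.9 or later.

```python
import math


def round_up_to_lcm(length, block_size, pad_seq_len_divisible):
    """Round `length` up to a multiple of lcm(block_size, pad_seq_len_divisible).

    Hypothetical helper for illustration; not the library implementation.
    """
    multiple = math.lcm(block_size, pad_seq_len_divisible)
    # Ceiling division, then scale back up to a multiple.
    return -(-length // multiple) * multiple


print(round_up_to_lcm(20, 8, 12))  # lcm(8, 12) = 24, so 20 rounds up to 24
print(round_up_to_lcm(25, 8, 12))  # 25 rounds up to 48
```

Rounding to the least common multiple keeps the final length divisible by both values at once, so block alignment and the divisibility constraint hold together.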

```python
nemo_automodel.components.datasets.dllm.collate.DLLMCollator.__call__(
    batch: typing.List[typing.Dict[str, list]]
) -> typing.Dict[str, torch.Tensor]
```

```python
nemo_automodel.components.datasets.dllm.collate.DLLMCollator._block_fill_ends(
    content_lengths: torch.Tensor,
    prefix_lengths: torch.Tensor
) -> torch.Tensor
```

Per-sample end of the EOS block-fill, RESPONSE-RELATIVE.

The fill rounds the response length (measured from `prefix`, the response
start) up to a `block_size` multiple, so `fill_end - prefix` is a whole
number of canvas blocks. With `prefix == 0` (no prompt, e.g. plain MDLM)
this reduces to the old absolute rounding, so non-SFT paths are unchanged.
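
The arithmetic described above can be sketched for a single sample with plain integers. The real method operates on tensors of per-sample lengths; `block_fill_end` below is a hypothetical scalar stand-in, not the library code.

```python
def block_fill_end(content_length, prefix_length, block_size):
    """End index of the EOS block-fill, measured response-relative.

    Hypothetical scalar sketch; the documented method takes tensors.
    """
    response_len = content_length - prefix_length
    # Round the response length up to a whole number of blocks.
    n_blocks = -(-response_len // block_size)
    return prefix_length + n_blocks * block_size


# 5 prompt tokens + 6 response tokens, block_size 4:
# the response rounds up to 8, so the fill ends at 5 + 8 = 13.
print(block_fill_end(content_length=11, prefix_length=5, block_size=4))  # 13

# With prefix == 0 this is plain absolute rounding: 11 rounds up to 12.
print(block_fill_end(content_length=11, prefix_length=0, block_size=4))  # 12
```

Note that in the first case `fill_end - prefix` is 8, a whole number of blocks, while `fill_end` itself (13) is not a multiple of `block_size`.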

```python
nemo_automodel.components.datasets.dllm.collate.DLLMCollator._compute_target_length(
    fill_ends: torch.Tensor
) -> int
```

```python
nemo_automodel.components.datasets.dllm.collate.DLLMCollator._first_supervised_index(
    loss_mask: list,
    default: int
) -> int
```

Static method.

Returns the first index where `loss_mask` is truthy (the response start).
If no token is supervised, returns `default`, so the whole sample is
treated as prefix.
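
A minimal sketch of this lookup, written as a free function for illustration (the name `first_supervised_index` mirrors the method but this is not the library source):

```python
def first_supervised_index(loss_mask, default):
    """Return the first index with a truthy loss_mask entry, else `default`."""
    for i, flag in enumerate(loss_mask):
        if flag:
            return i
    return default


# The response starts at index 2.
print(first_supervised_index([0, 0, 1, 1], default=4))  # 2

# No supervised token: fall back to the default (here, the sample length).
print(first_supervised_index([0, 0, 0], default=3))     # 3
```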

```python
nemo_automodel.components.datasets.dllm.collate.DLLMCollator._pad_and_fill(
    samples: typing.List[list],
    content_lengths: torch.Tensor,
    fill_ends: torch.Tensor,
    target_len: int,
    pad_value: int,
    block_pad_value: int,
    dtype: torch.dtype = torch.long
) -> torch.Tensor
```

Pad variable-length lists to *target\_len* with two-stage fill.
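
One way to picture the two-stage fill across a batch is shown below. This is a sketch under stated assumptions: it uses nested Python lists instead of a `torch.Tensor`, omits the `dtype` argument, and `pad_and_fill` is a hypothetical stand-in rather than the method's actual body.

```python
def pad_and_fill(samples, content_lengths, fill_ends, target_len,
                 pad_value, block_pad_value):
    """Pad each sample to target_len: content, then block fill, then global pad.

    Hypothetical list-based sketch; the documented method returns a tensor.
    """
    rows = []
    for sample, n, fill_end in zip(samples, content_lengths, fill_ends):
        if fill_end > target_len:
            raise ValueError("fill_end must not exceed target_len")
        row = [pad_value] * target_len         # stage 2: global pad everywhere
        row[:n] = sample[:n]                   # real content
        row[n:fill_end] = [block_pad_value] * (fill_end - n)  # stage 1: block fill
        rows.append(row)
    return rows


batch = pad_and_fill(
    samples=[[5, 6, 7], [8, 9, 10, 11, 12]],
    content_lengths=[3, 5],
    fill_ends=[4, 8],
    target_len=8,
    pad_value=0,
    block_pad_value=2,
)
print(batch[0])  # [5, 6, 7, 2, 0, 0, 0, 0]
print(batch[1])  # [8, 9, 10, 11, 12, 2, 2, 2]
```

Under the layout described at the top of this page, the same routine would plausibly serve both fields: token IDs with `pad_value=pad_token_id` and `block_pad_value=eos_token_id`, and the loss mask with `pad_value=0` and `block_pad_value=1`.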