# nemo_automodel.components.datasets.multimodal

BAGEL-style multimodal data pipeline for packed three-group training.

Ports the subset of the upstream BAGEL `data` code needed to drive BAGEL
training entirely from AutoModel-native (AM-native) code. Three workloads are
covered: vision-language model (VLM) supervised fine-tuning (SFT),
text-to-image (T2I) pretraining, and unified image editing. Stage 1 and
Stage 2 share the same packed batch schema; the model stage determines whether
the VAE / flow-matching tensors are consumed.
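
The idea of one shared batch schema with stage-dependent consumption can be illustrated with a minimal, self-contained sketch. Everything in it is hypothetical: the key names (`input_ids`, `vae_images`, `flow_timesteps`), the helper `select_model_inputs`, and the `use_generation_tensors` flag are illustrative assumptions, not the package's actual schema or API. Which stage reads the VAE / flow-matching tensors is also not stated here, so the sketch leaves that choice to the caller.

```python
from typing import Any, Dict

# Hypothetical key names for illustration only; the real packed-batch
# schema is defined by the package, not by this sketch.
GENERATION_KEYS = {"vae_images", "flow_timesteps"}


def select_model_inputs(
    batch: Dict[str, Any], use_generation_tensors: bool
) -> Dict[str, Any]:
    """Return the part of a packed batch that a given stage would read.

    The batch layout is identical in both cases; only the subset of
    keys that is consumed differs.
    """
    if use_generation_tensors:
        return dict(batch)
    return {k: v for k, v in batch.items() if k not in GENERATION_KEYS}


# One batch, built once, with every field populated (plain lists stand
# in for tensors to keep the sketch dependency-free).
packed_batch = {
    "input_ids": [1, 2, 3],
    "vae_images": [[0.0]],
    "flow_timesteps": [0.5],
}

# A stage that ignores the generation tensors sees only the remaining keys.
text_only_inputs = select_model_inputs(packed_batch, use_generation_tensors=False)

# A stage that consumes them sees the full batch.
full_inputs = select_model_inputs(packed_batch, use_generation_tensors=True)
```

Keeping the schema fixed and filtering at the consumer means the same data pipeline can feed both stages without a separate collate path for each.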

## Submodules

* **[`nemo_automodel.components.datasets.multimodal.collate_fns`](/nemo-automodel/nemo_automodel/components/datasets/multimodal/collate_fns)**
* **[`nemo_automodel.components.datasets.multimodal.datasets`](/nemo-automodel/nemo_automodel/components/datasets/multimodal/datasets)**
* **[`nemo_automodel.components.datasets.multimodal.distributed_iterable`](/nemo-automodel/nemo_automodel/components/datasets/multimodal/distributed_iterable)**
* **[`nemo_automodel.components.datasets.multimodal.interleave`](/nemo-automodel/nemo_automodel/components/datasets/multimodal/interleave)**
* **[`nemo_automodel.components.datasets.multimodal.packing`](/nemo-automodel/nemo_automodel/components/datasets/multimodal/packing)**
* **[`nemo_automodel.components.datasets.multimodal.parquet_utils`](/nemo-automodel/nemo_automodel/components/datasets/multimodal/parquet_utils)**
* **[`nemo_automodel.components.datasets.multimodal.transforms`](/nemo-automodel/nemo_automodel/components/datasets/multimodal/transforms)**
* **[`nemo_automodel.components.datasets.multimodal.utils`](/nemo-automodel/nemo_automodel/components/datasets/multimodal/utils)**
* **[`nemo_automodel.components.datasets.multimodal.video`](/nemo-automodel/nemo_automodel/components/datasets/multimodal/video)**

## Package Contents

### Data

[`__all__`](#nemo_automodel-components-datasets-multimodal-__all__)

### API

```python
nemo_automodel.components.datasets.multimodal.__all__ = ['DATASET_REGISTRY', 'DEFAULT_DATASET_INFO', 'DataConfig', 'DistributedIterableD...
```