nemo_automodel.components.datasets.multimodal

BAGEL-style multimodal data pipeline for packed three-group training.

Ports the subset of the upstream BAGEL data pipeline needed to feed BAGEL training from fully AM-native code. It covers three workloads: VLM SFT, T2I pretraining, and unified image editing. The packed batch schema is shared by Stage 1 and Stage 2; the model stage determines whether the VAE / flow-matching tensors are consumed.

Submodules

Package Contents

Data

__all__

API

nemo_automodel.components.datasets.multimodal.__all__ = ['DATASET_REGISTRY', 'DEFAULT_DATASET_INFO', 'DataConfig', 'DistributedIterableD...
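Because the `__all__` value above is truncated, one way to see the full list of exported names is to inspect the installed package directly. The following is a minimal sketch, not part of the package: `exported_names` is a hypothetical helper written for illustration, and it assumes only that the module defines `__all__` as shown above. If the package is not installed, it returns an empty list.

```python
import importlib


def exported_names(module_name: str) -> list[str]:
    """Return the sorted names in a module's __all__, or [] if unavailable.

    Hypothetical helper for illustration; not part of nemo_automodel.
    """
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        # The module (or one of its dependencies) cannot be imported here.
        return []
    # Fall back to an empty list when the module does not define __all__.
    return sorted(getattr(module, "__all__", []))


if __name__ == "__main__":
    # Print every public name the multimodal package exports, one per line.
    for name in exported_names("nemo_automodel.components.datasets.multimodal"):
        print(name)
```

Run in an environment where the package is installed, the list should include the names visible above, such as `DATASET_REGISTRY`, `DEFAULT_DATASET_INFO`, and `DataConfig`.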