nemo_automodel.components.datasets.multimodal.interleave
Interleaved-image parquet datasets for BAGEL editing + joint recipes.
Provides:
- :class:`InterleavedBaseIterableDataset` – mixin that exposes the ``_init_data`` / ``_add_text`` / ``_add_image`` / ``_add_video`` builders for per-row assembly of the packed-sequence plan.
- :class:`ParquetStandardIterableDataset` – base class that iterates per row group over a list of parquet files; subclasses override ``parse_row`` to turn a pandas row into a ``dict`` compatible with :class:`.packing.PackedDataset`.
- :class:`UnifiedEditIterableDataset` – concrete ``parse_row`` that emits interleaved (input-image, instruction, output-image) samples from an image-editing parquet schema (``image_list`` + ``instruction_list``).
When ``visual_gen=False``, these samples can still flow through packing while
the model ignores VAE / flow-matching tensors. Stage 2 consumes the same
yielded sample dicts for edit-generation loss.
Module Contents
Classes
Data
API
nemo_automodel.components.datasets.multimodal.interleave.InterleavedBaseIterableDataset
Bases: DistributedIterableDataset
Builder mixin for interleaved image/text/video sequence plans.
Subclasses still provide ``__init__``, ``__iter__``, and ``parse_row``
(via :class:`ParquetStandardIterableDataset`); this class only holds
the per-item append helpers used inside ``parse_row``.
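
The append-helper pattern described above can be sketched roughly as follows. This is a minimal illustration only, not the library's implementation: the class ``ToyInterleavedBuilder``, the helper signatures, and the dict keys (``sequence_plan``, ``text_list``, ``image_list``, ``loss``) are all assumptions made for the example. Only the helper names ``_init_data``, ``_add_text``, and ``_add_image`` come from the documentation above.

```python
class ToyInterleavedBuilder:
    """Hypothetical stand-in for the builder mixin; field names are assumptions."""

    def _init_data(self):
        # Fresh per-row container: an ordered plan plus the payload lists.
        return {"sequence_plan": [], "text_list": [], "image_list": []}

    def _add_text(self, data, text, need_loss):
        # Append a text item and record its position in the plan.
        data["text_list"].append(text)
        data["sequence_plan"].append({"type": "text", "loss": need_loss})
        return data

    def _add_image(self, data, image_bytes, need_loss):
        # Append an image item and record its position in the plan.
        data["image_list"].append(image_bytes)
        data["sequence_plan"].append({"type": "image", "loss": need_loss})
        return data


# Assemble one (input image, instruction, output image) item.
builder = ToyInterleavedBuilder()
data = builder._init_data()
data = builder._add_image(data, b"<input image bytes>", need_loss=False)
data = builder._add_text(data, "make the sky pink", need_loss=False)
data = builder._add_image(data, b"<output image bytes>", need_loss=True)
plan_types = [step["type"] for step in data["sequence_plan"]]
```

In this sketch the plan records item order and which items contribute to the loss, while the payloads live in separate lists; the real helpers may track additional fields.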
nemo_automodel.components.datasets.multimodal.interleave.ParquetStandardIterableDataset
Bases: DistributedIterableDataset
Base class: iterate per-(file, row_group) across a list of parquet shards.
Subclasses override :meth:`parse_row` to turn one pandas row into the
dict schema consumed by :class:`.packing.PackedDataset`.
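
The per-(file, row_group) iteration and the ``parse_row`` override can be illustrated with a small, dependency-free sketch. Everything here is an assumption made for illustration: in-memory lists stand in for parquet row groups, ``ToyRowGroupDataset`` and ``CaptionDataset`` are hypothetical names, work is split across ranks round-robin, and rows for which ``parse_row`` returns ``None`` are skipped. The actual class may distribute work and handle invalid rows differently.

```python
from typing import Any, Dict, Iterator, List, Optional

Row = Dict[str, Any]
Shards = Dict[str, List[List[Row]]]  # file name -> row groups -> rows


def iter_row_groups(shards: Shards, rank: int, world_size: int):
    """Yield (file, row_group_index, rows) for the units assigned to this rank."""
    units = [(name, i) for name in sorted(shards) for i in range(len(shards[name]))]
    for unit_id, (name, i) in enumerate(units):
        # Assumed round-robin split of (file, row_group) units across ranks.
        if unit_id % world_size == rank:
            yield name, i, shards[name][i]


class ToyRowGroupDataset:
    """Hypothetical base class: walks row groups and delegates to parse_row."""

    def __init__(self, shards: Shards, rank: int = 0, world_size: int = 1):
        self.shards, self.rank, self.world_size = shards, rank, world_size

    def parse_row(self, row: Row) -> Optional[dict]:
        raise NotImplementedError

    def __iter__(self) -> Iterator[dict]:
        for _, _, rows in iter_row_groups(self.shards, self.rank, self.world_size):
            for row in rows:
                sample = self.parse_row(row)
                if sample is not None:  # assumption: None means "skip this row"
                    yield sample


class CaptionDataset(ToyRowGroupDataset):
    """Hypothetical subclass that overrides parse_row."""

    def parse_row(self, row: Row) -> Optional[dict]:
        if not row.get("caption"):
            return None
        return {"text": row["caption"]}


shards = {
    "a.parquet": [[{"caption": "a0"}, {"caption": ""}], [{"caption": "a1"}]],
    "b.parquet": [[{"caption": "b0"}]],
}
rank0 = [s["text"] for s in CaptionDataset(shards, rank=0, world_size=2)]
rank1 = [s["text"] for s in CaptionDataset(shards, rank=1, world_size=2)]
```

Treating the (file, row_group) pair as the unit of work means each rank only ever needs to hold one row group in memory at a time.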
nemo_automodel.components.datasets.multimodal.interleave.UnifiedEditIterableDataset
Bases: InterleavedBaseIterableDataset, ParquetStandardIterableDataset
Image-editing dataset: (input, instruction, output) chains over parquet.
Row schema (upstream BAGEL seedxedit_multi + compatibles):
- ``image_list``: list of raw image bytes (at least 2).
- ``instruction_list``: list of lists; ``instruction_list[i]`` is a set of equivalent phrasings for the edit that turns ``image_list[i]`` into ``image_list[i+1]``.
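
To make the row schema concrete, the sketch below turns one row into a chain of (input, instruction, output) steps. It is an illustration under stated assumptions, not the dataset's actual ``parse_row``: the function ``build_edit_chain`` and the output keys are hypothetical, and picking one phrasing at random per step is an assumed policy.

```python
import random


def build_edit_chain(row, rng):
    """Pair consecutive images with one phrasing of the edit between them."""
    images = row["image_list"]
    instructions = row["instruction_list"]
    if len(images) < 2:
        raise ValueError("image_list needs at least 2 images")
    if len(instructions) < len(images) - 1:
        raise ValueError("need one instruction set per consecutive image pair")

    steps = []
    for i in range(len(images) - 1):
        steps.append(
            {
                "input_image": images[i],
                # Assumed policy: sample one of the equivalent phrasings.
                "instruction": rng.choice(instructions[i]),
                "output_image": images[i + 1],
            }
        )
    return steps


row = {
    "image_list": [b"img0", b"img1", b"img2"],
    "instruction_list": [["add a hat", "put a hat on"], ["make it night"]],
}
steps = build_edit_chain(row, random.Random(0))
```

With three images the chain has two steps, and the output image of the first step is the input image of the second.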