nemo_automodel.components.datasets.multimodal.packing
PackedDataset + DataConfig — packed-sequence iterable for BAGEL training.
Module Contents
Classes
Data
API
DataConfig
Container for the packing-level knobs and the grouped-dataset YAML dict.
PackedDataset
Bases: IterableDataset
Greedy pack of samples drawn from weighted groups into token-budgeted batches.
The dataset reseeds at iterator start so AM sees a deterministic BAGEL-compatible packed-data stream regardless of earlier RNG consumption during model construction or checkpoint loading.
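The two behaviours described above (greedy packing under a token budget, and reseeding when iteration starts) can be illustrated with a minimal sketch. This is not the library's implementation: `ToyPackedStream`, its constructor arguments, and the representation of a sample as a bare token length are all hypothetical, and the sketch uses a plain Python iterable rather than subclassing `IterableDataset` so that it stays dependency-free.

```python
import random
from typing import Dict, Iterator, List, Tuple


class ToyPackedStream:
    """Hypothetical stand-in for a packed iterable; NOT the library class."""

    def __init__(
        self,
        groups: Dict[str, List[int]],   # group name -> sample token lengths (assumed shape)
        weights: Dict[str, float],      # group name -> sampling weight (assumed shape)
        max_tokens: int,                # token budget per packed batch
        seed: int = 0,
        num_batches: int = 4,
    ) -> None:
        self.groups = groups
        self.weights = weights
        self.max_tokens = max_tokens
        self.seed = seed
        self.num_batches = num_batches

    def __iter__(self) -> Iterator[List[Tuple[str, int]]]:
        # Reseed a private RNG at iterator start, so the stream does not
        # depend on how much global RNG state was consumed beforehand.
        rng = random.Random(self.seed)
        names = sorted(self.groups)
        group_weights = [self.weights[name] for name in names]

        batch: List[Tuple[str, int]] = []
        used = 0
        yielded = 0
        while yielded < self.num_batches:
            # Pick a group by weight, then a sample from that group.
            name = rng.choices(names, weights=group_weights, k=1)[0]
            length = rng.choice(self.groups[name])
            if length > self.max_tokens:
                continue  # this sample can never fit in a batch; drop it
            if used + length > self.max_tokens:
                # Greedy rule: emit the batch as soon as the next sample
                # would exceed the budget, then start a new one.
                yield batch
                yielded += 1
                batch, used = [], 0
            batch.append((name, length))
            used += length


groups = {"text": [30, 50], "image": [120, 200]}
weights = {"text": 1.0, "image": 2.0}
stream = ToyPackedStream(groups, weights, max_tokens=256, seed=42)

random.random()           # consume some global RNG state
first = list(stream)
random.random()           # consume more
second = list(stream)
print(first == second)    # the two passes are identical
```

Because the RNG is created inside `__iter__`, each new iterator replays the same sequence of packed batches regardless of what happened to the global RNG in between. The sketch assumes at least one sample fits within `max_tokens`; otherwise the loop would never yield.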
_drop_counters
_resume_buffer
_resume_sequence_status
_yielded_batches