nemo_automodel.components.datasets.llm.mock_iterable_dataset
Module Contents
Classes
API
Bases: IterableDataset
Mock dataset that generates synthetic data for benchmarking.
Like the benchmarking script, this dataset generates random tokens and produces input_ids, labels, and position_ids for each sample.
Generate synthetic batches.
Return the number of samples.
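As a rough illustration of the behavior described above, the following is a minimal, standard-library-only sketch of a dataset that yields synthetic samples and reports its length. It is not the library's implementation. The class name `ToyMockIterableDataset`, the constructor parameters (`vocab_size`, `seq_len`, `num_samples`, `seed`), the choice to copy `input_ids` into `labels`, and the use of plain Python lists instead of tensors are all assumptions made for this example. The documented class derives from `IterableDataset`, which the sketch leaves out so that it runs without extra dependencies.

```python
import random
from typing import Dict, Iterator, List


class ToyMockIterableDataset:
    """Hypothetical stand-in for a mock dataset (not the library class)."""

    def __init__(self, vocab_size: int, seq_len: int, num_samples: int, seed: int = 0) -> None:
        # All four parameters are assumptions made for this sketch.
        self.vocab_size = vocab_size
        self.seq_len = seq_len
        self.num_samples = num_samples
        self.seed = seed

    def __iter__(self) -> Iterator[Dict[str, List[int]]]:
        # A fresh generator per iteration makes every pass reproducible.
        rng = random.Random(self.seed)
        for _ in range(self.num_samples):
            # Random token ids drawn uniformly from [0, vocab_size).
            input_ids = [rng.randrange(self.vocab_size) for _ in range(self.seq_len)]
            yield {
                "input_ids": input_ids,
                # Assumption: labels are a plain copy of the inputs.
                "labels": list(input_ids),
                # Positions 0 .. seq_len - 1.
                "position_ids": list(range(self.seq_len)),
            }

    def __len__(self) -> int:
        # Number of samples produced by one full iteration.
        return self.num_samples


dataset = ToyMockIterableDataset(vocab_size=100, seq_len=8, num_samples=4)
samples = list(dataset)
```

Because the tokens are random, a dataset like this is useful for measuring throughput or checking that a training loop runs, not for evaluating model quality.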