nemo_automodel.components.datasets.vlm.datasets#
Module Contents#
Functions#
make_rdr_dataset | Load and preprocess the RDR dataset for image-to-text fine-tuning.
make_cord_v2_dataset | Load and preprocess the CORD-V2 dataset for image-to-text fine-tuning.
| Load and preprocess the MedPix dataset for image-to-text fine-tuning.
| Load and preprocess the CommonVoice 17 dataset for audio-to-text fine-tuning.
API#
- nemo_automodel.components.datasets.vlm.datasets.make_rdr_dataset(path_or_dataset='quintend/rdr-items', split='train', **kwargs)
Load and preprocess the RDR dataset for image-to-text fine-tuning.
- Parameters:
  - path_or_dataset (str) – Path or identifier for the RDR dataset.
  - split (str) – Dataset split to load.
  - **kwargs – Additional arguments.
- Returns:
  The processed dataset.
- Return type:
  Dataset
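The signature above could be exercised roughly as follows. This is a minimal sketch, not the library's recommended usage: `DEFAULT_RDR_ARGS`, `build_rdr_args`, and `load_rdr` are hypothetical helper names introduced only for illustration, and the import is deferred so the snippet can be loaded even where `nemo_automodel` is not installed. It also assumes that calling the loader with a remote identifier needs network access.

```python
import importlib

# Defaults copied from the documented signature of make_rdr_dataset.
DEFAULT_RDR_ARGS = {"path_or_dataset": "quintend/rdr-items", "split": "train"}


def build_rdr_args(**overrides):
    """Merge caller overrides into the documented defaults (hypothetical helper)."""
    return {**DEFAULT_RDR_ARGS, **overrides}


def load_rdr(**overrides):
    """Call make_rdr_dataset with the merged arguments (hypothetical wrapper).

    The module is imported lazily, so defining this function does not
    require nemo_automodel; calling it does.
    """
    module = importlib.import_module(
        "nemo_automodel.components.datasets.vlm.datasets"
    )
    return module.make_rdr_dataset(**build_rdr_args(**overrides))
```

With the package installed, `load_rdr()` would request the default `train` split, and `load_rdr(split=...)` would request another split, provided the dataset actually defines one by that name.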
- nemo_automodel.components.datasets.vlm.datasets.make_cord_v2_dataset(path_or_dataset='naver-clova-ix/cord-v2', split='train', **kwargs)
Load and preprocess the CORD-V2 dataset for image-to-text fine-tuning.