Bringing Your Own Dataset#
This guide explains how to import your own dataset to be used with NeMo Automodel.
Types of Supported Datasets#
NeMo Automodel supports several types of datasets for different training scenarios:
Completion datasets: Single text sequences for language modeling
Conversation datasets: Multi-turn chat dialogues
Instruction datasets: Question-answer pairs
Multi-modal datasets: Text combined with images and audio
TODO: onboard CPT and multi-turn SFT. Add documentation for all of the above.