Load audio files from local directories by creating custom manifests that reference your audio files. This guide covers supported formats and basic approaches for organizing local audio data for NeMo Curator processing.
To process local audio files with NeMo Curator, you need to create a manifest file that lists your audio files and their metadata. NeMo Curator does not provide automatic audio file discovery, so you must create a JSONL manifest first.
NeMo Curator supports audio formats compatible with the soundfile library:
MP3 (.mp3) support depends on your system’s libsndfile build. For the most reliable behavior across environments, prefer WAV (.wav) or FLAC (.flac) formats.
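Because MP3 availability varies by libsndfile build, it can help to check what your environment reports before you collect files. The sketch below assumes the `soundfile` Python package is installed and that its `available_formats()` dictionary is keyed by names such as `WAV`, `FLAC`, and `MP3`. The `check_formats` helper is a hypothetical name used only for illustration.

```python
def check_formats(available, wanted=("WAV", "FLAC", "MP3")):
    """Map each wanted format name to True/False depending on availability."""
    return {name: name in available for name in wanted}


try:
    import soundfile as sf

    # Dict mapping format name -> human-readable description.
    available = sf.available_formats()
except (ImportError, OSError):
    # soundfile is not installed, or libsndfile could not be loaded.
    available = {}

for name, ok in check_formats(available).items():
    print(f"{name}: {'supported' if ok else 'not available'}")
```

If MP3 shows as not available, converting those files to WAV or FLAC before building the manifest is the safer route.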
Create a JSONL manifest file that lists your local audio files:
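A minimal sketch of manifest creation using only the Python standard library. It assumes the manifest uses an `audio_filepath` key (the common NeMo manifest convention; confirm it against the stage you plan to run) and that only `.wav` and `.flac` files should be collected. `build_manifest` and `AUDIO_EXTENSIONS` are hypothetical names.

```python
import json
from pathlib import Path

# Assumption: restrict to the formats recommended for reliability.
AUDIO_EXTENSIONS = {".wav", ".flac"}


def build_manifest(audio_dir, manifest_path):
    """Write one JSON object per audio file found under audio_dir.

    Returns the number of entries written.
    """
    audio_dir = Path(audio_dir)
    count = 0
    with open(manifest_path, "w", encoding="utf-8") as out:
        # Sort for a stable, reproducible manifest order.
        for path in sorted(audio_dir.rglob("*")):
            if path.is_file() and path.suffix.lower() in AUDIO_EXTENSIONS:
                entry = {"audio_filepath": str(path.resolve())}
                out.write(json.dumps(entry, ensure_ascii=False) + "\n")
                count += 1
    return count
```

For example, `build_manifest("data/audio", "manifest.jsonl")` would write one line per audio file, using absolute paths so the manifest stays valid regardless of the working directory.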
Paired Audio-Text Files:
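For the paired layout, one possible approach is to look for a transcript with the same base name next to each audio file. This sketch assumes `.wav` audio, `.txt` transcripts, and the `audio_filepath` and `text` manifest keys. The function names are hypothetical.

```python
import json
from pathlib import Path


def paired_entries(data_dir, audio_suffix=".wav", text_suffix=".txt"):
    """Yield manifest entries for audio files with a same-named transcript."""
    for audio in sorted(Path(data_dir).rglob(f"*{audio_suffix}")):
        transcript = audio.with_suffix(text_suffix)
        if transcript.is_file():
            yield {
                "audio_filepath": str(audio),
                "text": transcript.read_text(encoding="utf-8").strip(),
            }


def write_jsonl(entries, manifest_path):
    """Write entries as one JSON object per line; return the count written."""
    count = 0
    with open(manifest_path, "w", encoding="utf-8") as out:
        for entry in entries:
            out.write(json.dumps(entry, ensure_ascii=False) + "\n")
            count += 1
    return count
```

Audio files without a matching transcript are skipped here. Whether to skip them or emit an entry without `text` depends on what your downstream stages expect.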
Separated Directories:
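When audio and transcripts live in separate directory trees, matching on the relative path is one way to pair them. The sketch assumes the two trees mirror each other, with `.wav` audio and `.txt` transcripts, and uses the `audio_filepath` and `text` keys. `separated_entries` is a hypothetical name.

```python
import json
from pathlib import Path


def separated_entries(audio_dir, transcript_dir,
                      audio_suffix=".wav", text_suffix=".txt"):
    """Yield entries by matching each audio file to a transcript that has
    the same relative path (and a different suffix) under transcript_dir."""
    audio_dir, transcript_dir = Path(audio_dir), Path(transcript_dir)
    for audio in sorted(audio_dir.rglob(f"*{audio_suffix}")):
        relative = audio.relative_to(audio_dir).with_suffix(text_suffix)
        transcript = transcript_dir / relative
        if transcript.is_file():
            yield {
                "audio_filepath": str(audio),
                "text": transcript.read_text(encoding="utf-8").strip(),
            }


def write_manifest(entries, manifest_path):
    """Write entries as JSONL, one object per line."""
    with open(manifest_path, "w", encoding="utf-8") as out:
        for entry in entries:
            out.write(json.dumps(entry, ensure_ascii=False) + "\n")
```

A call such as `write_manifest(separated_entries("audio", "transcripts"), "manifest.jsonl")` would then produce the manifest; the directory names here are placeholders.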
After creating your manifest, process it with NeMo Curator:
If you have existing transcription files, include them in your manifest:
Structure your audio files for easy manifest creation:
Ensure all audio files exist before processing:
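A simple pre-flight check can be sketched with the standard library alone. It assumes each manifest line is a JSON object whose path is stored under `audio_filepath`; `find_missing_audio` is a hypothetical helper name.

```python
import json
from pathlib import Path


def find_missing_audio(manifest_path, key="audio_filepath"):
    """Return (line_number, path) pairs for entries whose audio file
    is absent from disk or whose path field is missing."""
    missing = []
    with open(manifest_path, encoding="utf-8") as manifest:
        for line_number, line in enumerate(manifest, start=1):
            if not line.strip():
                continue  # tolerate blank lines
            entry = json.loads(line)
            path = entry.get(key)
            if not path or not Path(path).is_file():
                missing.append((line_number, path))
    return missing
```

Running this before processing and stopping when the returned list is non-empty surfaces broken paths early, rather than partway through a long job.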