Load Local Audio Files
Load audio files from local directories by creating a custom manifest that references them. This guide covers supported formats and basic approaches for organizing local audio data for NeMo Curator processing.
Overview
To process local audio files with NeMo Curator, you need to create a manifest file that lists your audio files and their metadata. NeMo Curator does not provide automatic audio file discovery, so you must create a JSONL manifest first.
Supported Audio Formats
NeMo Curator supports audio formats compatible with the soundfile library:
MP3 (.mp3) support depends on your system’s libsndfile build. For the most reliable behavior across environments, prefer WAV (.wav) or FLAC (.flac) formats.
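Because support varies by build, it can help to ask the installed soundfile library directly. The sketch below uses `soundfile.available_formats()`, which reports the major formats the underlying libsndfile build supports. The helper name `is_format_available` is illustrative and not part of NeMo Curator.

```python
def is_format_available(format_name: str) -> bool:
    """Return True if the local soundfile/libsndfile build lists the given format."""
    try:
        import soundfile as sf
    except (ImportError, OSError):
        # soundfile is not installed, or libsndfile could not be loaded.
        return False
    # available_formats() returns a dict keyed by names such as "WAV" or "FLAC".
    return format_name.upper() in sf.available_formats()


if __name__ == "__main__":
    for name in ("WAV", "FLAC", "MP3"):
        print(name, is_format_available(name))
```

If `MP3` reports `False` on your system, converting those files to WAV or FLAC before building the manifest is the safer route.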
Creating Manifests for Local Files
Basic Manifest Creation
Create a JSONL manifest file that lists your local audio files:
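A minimal sketch of such a script, using only the Python standard library. It assumes each manifest line is a JSON object with an `audio_filepath` key (the NeMo-style convention; confirm the key your pipeline expects). The function name `build_manifest` and the extension set are illustrative choices, not part of NeMo Curator.

```python
import json
from pathlib import Path

# Assumption: extensions to include; adjust to what your libsndfile build can read.
AUDIO_EXTENSIONS = {".wav", ".flac"}


def build_manifest(audio_dir: str, manifest_path: str) -> int:
    """Write one JSON object per audio file found under audio_dir; return the count."""
    root = Path(audio_dir)
    count = 0
    with open(manifest_path, "w", encoding="utf-8") as out:
        # Sort so the manifest is reproducible across runs.
        for path in sorted(root.rglob("*")):
            if path.is_file() and path.suffix.lower() in AUDIO_EXTENSIONS:
                # Absolute paths keep the manifest valid from any working directory.
                entry = {"audio_filepath": str(path.resolve())}
                out.write(json.dumps(entry) + "\n")
                count += 1
    return count
```

For example, `build_manifest("/data/audio", "manifest.jsonl")` (both paths hypothetical) would write one line per `.wav` or `.flac` file found under `/data/audio`, including subdirectories.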
Directory Organization Examples
Paired Audio-Text Files:
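One plausible reading of this layout is that each audio file sits next to a transcript with the same base name. The sketch below finds such pairs. The directory and file names, the `.txt` extension, and the `find_pairs` helper are all assumptions for illustration.

```python
from pathlib import Path

# Hypothetical layout:
#   data/
#     clip1.wav
#     clip1.txt
#     clip2.wav
#     clip2.txt


def find_pairs(data_dir: str, audio_ext: str = ".wav", text_ext: str = ".txt"):
    """Return (audio_path, transcript_path) tuples for files sharing a base name."""
    pairs = []
    for audio in sorted(Path(data_dir).glob(f"*{audio_ext}")):
        transcript = audio.with_suffix(text_ext)
        # Skip audio files that have no transcript next to them.
        if transcript.is_file():
            pairs.append((audio, transcript))
    return pairs
```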
Separated Directories:
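When audio and transcripts live in separate directory trees, a common approach is to match them by relative path. The sketch below assumes the transcript tree mirrors the audio tree. The directory names and the `match_transcripts` helper are hypothetical.

```python
from pathlib import Path

# Hypothetical layout:
#   data/audio/speaker1/clip1.wav
#   data/transcripts/speaker1/clip1.txt


def match_transcripts(audio_root: str, transcript_root: str, text_ext: str = ".txt"):
    """Map each .wav under audio_root to the transcript at the same relative path.

    Returns (matched, missing): a dict of audio path -> transcript path, and a
    list of audio paths with no transcript.
    """
    audio_base = Path(audio_root)
    text_base = Path(transcript_root)
    matched, missing = {}, []
    for audio in sorted(audio_base.rglob("*.wav")):
        relative = audio.relative_to(audio_base).with_suffix(text_ext)
        transcript = text_base / relative
        if transcript.is_file():
            matched[audio] = transcript
        else:
            missing.append(audio)
    return matched, missing
```

Returning the unmatched files separately makes it easy to decide whether to drop them or write them to the manifest without text.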
Processing Local Audio with Manifest
After creating your manifest, process it with NeMo Curator:
Manifest with Existing Transcriptions
If you have existing transcription files, include them in your manifest:
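A minimal sketch of one way to do this, assuming the manifest uses the NeMo-style keys `audio_filepath` and `text` (verify the key names your pipeline expects). The helper `write_manifest_with_text` is illustrative; it takes audio and transcript paths that you have already paired up.

```python
import json
from pathlib import Path


def write_manifest_with_text(pairs, manifest_path: str) -> None:
    """Write a JSONL manifest from an iterable of (audio_path, transcript_path)."""
    with open(manifest_path, "w", encoding="utf-8") as out:
        for audio, transcript in pairs:
            # Each transcript file is assumed to hold plain UTF-8 text.
            text = Path(transcript).read_text(encoding="utf-8").strip()
            entry = {"audio_filepath": str(Path(audio).resolve()), "text": text}
            # ensure_ascii=False keeps non-ASCII transcripts readable in the file.
            out.write(json.dumps(entry, ensure_ascii=False) + "\n")
```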
Best Practices
Organize Your Files
Structure your audio files for easy manifest creation:
Validate File Paths
Ensure all audio files exist before processing:
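A pre-flight check along these lines could look like the sketch below. It assumes the manifest stores each path under an `audio_filepath` key. The `find_missing_audio` helper is illustrative and not part of NeMo Curator.

```python
import json
from pathlib import Path


def find_missing_audio(manifest_path: str) -> list[str]:
    """Return a description of every manifest line whose audio file is absent."""
    missing = []
    with open(manifest_path, encoding="utf-8") as f:
        for line_no, line in enumerate(f, start=1):
            if not line.strip():
                continue  # tolerate blank lines
            entry = json.loads(line)
            path = entry.get("audio_filepath")
            # Flag entries with no path as well as paths that do not exist.
            if path is None or not Path(path).is_file():
                missing.append(f"line {line_no}: {path}")
    return missing
```

An empty return value means every referenced file was found; otherwise, fix or remove the reported lines before running the pipeline.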