Save & Export Audio Data
Export processed audio data and transcriptions in formats optimized for ASR model training, audio-and-text applications, and downstream analysis workflows.
NeMo Curator’s audio curation pipeline supports several output formats tailored to different use cases.
The primary output format for audio curation is JSONL (JSON Lines):
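As a rough illustration of the JSONL layout, the sketch below writes one JSON object per line and reads it back using only the Python standard library. It does not use NeMo Curator's own writer. The helper names `write_manifest` and `read_manifest`, the sample paths, and the field names `audio_filepath`, `text`, and `duration` are assumptions chosen for the example.

```python
import json
import tempfile
from pathlib import Path


def write_manifest(records, path):
    """Write an iterable of dicts as JSONL: one JSON object per line."""
    path = Path(path)
    path.parent.mkdir(parents=True, exist_ok=True)
    with path.open("w", encoding="utf-8") as handle:
        for record in records:
            # ensure_ascii=False keeps non-ASCII transcripts human-readable.
            handle.write(json.dumps(record, ensure_ascii=False) + "\n")
    return path


def read_manifest(path):
    """Read a JSONL file back into a list of dicts, skipping blank lines."""
    with Path(path).open(encoding="utf-8") as handle:
        return [json.loads(line) for line in handle if line.strip()]


# Hypothetical records; the field names and paths are illustrative only.
records = [
    {"audio_filepath": "/data/audio/sample_001.wav", "text": "hello world", "duration": 2.5},
    {"audio_filepath": "/data/audio/sample_002.wav", "text": "good morning", "duration": 1.8},
]

with tempfile.TemporaryDirectory() as tmp_dir:
    manifest_path = write_manifest(records, Path(tmp_dir) / "manifest.jsonl")
    round_trip = read_manifest(manifest_path)
    print(f"wrote and re-read {len(round_trip)} records")
```

Because each line is an independent JSON object, a manifest in this layout can be streamed, split, or concatenated without parsing the whole file.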
Standard fields included in audio manifests:
When source_files metadata exists, the writer generates deterministic hashed file names. Otherwise, it generates UUID-based names.
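The naming rule can be sketched as follows. This is not the writer's actual implementation: the helper `output_name`, the choice of SHA-256, the sorting of source paths, the 16-character truncation, and the `.jsonl` suffix are all assumptions made for illustration.

```python
import hashlib
import uuid


def output_name(source_files=None, suffix=".jsonl"):
    """Return a deterministic name when source files are known, else a random one."""
    if source_files:
        # Sorting makes the name independent of input order (an assumption here).
        joined = "\n".join(sorted(str(path) for path in source_files))
        digest = hashlib.sha256(joined.encode("utf-8")).hexdigest()[:16]
        return f"{digest}{suffix}"
    # No source metadata: fall back to a random UUID-based name.
    return f"{uuid.uuid4().hex}{suffix}"


# The same sources always map to the same name, so reruns overwrite
# earlier output instead of accumulating duplicates.
first = output_name(["/data/a.wav", "/data/b.wav"])
second = output_name(["/data/b.wav", "/data/a.wav"])
print(first == second)

# Without source metadata, every call yields a different name.
print(output_name() == output_name())
```

Deterministic names make repeated runs over the same inputs idempotent, while UUID-based names only guarantee that outputs do not collide.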
Before export, check your processed data:
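One possible pre-export check is sketched below in plain Python, independent of NeMo Curator's API. The required field names (`audio_filepath`, `text`, `duration`) and the helper `check_records` are assumptions; adjust them to the fields your pipeline actually produces.

```python
import math

# Assumed field names for illustration; replace with your pipeline's fields.
REQUIRED_FIELDS = ("audio_filepath", "text", "duration")


def check_records(records, required=REQUIRED_FIELDS):
    """Return a list of (record_index, problem) tuples; empty means no issues found."""
    problems = []
    for index, record in enumerate(records):
        missing = [name for name in required if name not in record]
        if missing:
            problems.append((index, "missing fields: " + ", ".join(missing)))
            continue

        duration = record.get("duration")
        if duration is not None:
            is_number = isinstance(duration, (int, float)) and not isinstance(duration, bool)
            if not (is_number and math.isfinite(duration) and duration > 0):
                problems.append((index, "duration must be a positive finite number"))

        text = record.get("text")
        if text is not None and not (isinstance(text, str) and text.strip()):
            problems.append((index, "text must be a non-empty string"))
    return problems


# Hypothetical records: one clean, one with an empty transcript, one missing a field.
sample = [
    {"audio_filepath": "a.wav", "text": "hello", "duration": 1.2},
    {"audio_filepath": "b.wav", "text": "   ", "duration": 0.8},
    {"audio_filepath": "c.wav", "duration": 3.0},
]

for index, problem in check_records(sample):
    print(f"record {index}: {problem}")
```

Running a check like this before writing the manifest surfaces incomplete records early, when they are cheaper to fix than after a training job has consumed them.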