Audio Curation Pipeline (Overview)
This guide provides an overview of the end-to-end audio curation workflow in NVIDIA NeMo Curator. It covers data ingestion and validation, optional ASR inference, quality assessment, filtering, and export or conversion. For detailed ASR pipeline information, refer to ASR Pipeline.
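
The flow described above (validate, measure, filter, export) can be sketched in plain Python. This is a minimal illustration and does not use the NeMo Curator API. `AudioEntry`, `add_duration`, `preserve_by_value`, `to_document`, and `write_silence` are hypothetical stand-ins, and `validate_item` here only borrows the name of the method mentioned in this guide. The optional ASR step is omitted, and the 1.0 second threshold is an arbitrary assumption.

```python
import os
import tempfile
import wave
from dataclasses import dataclass, field


@dataclass
class AudioEntry:
    """Hypothetical stand-in for one manifest entry (not the NeMo Curator AudioTask)."""
    audio_filepath: str
    fields: dict = field(default_factory=dict)


def validate_item(entry):
    # Ingestion check: the referenced audio file must exist on disk.
    return os.path.isfile(entry.audio_filepath)


def add_duration(entry):
    # Quality assessment: duration in seconds = frame count / sample rate.
    with wave.open(entry.audio_filepath, "rb") as wav:
        entry.fields["duration"] = wav.getnframes() / wav.getframerate()
    return entry


def preserve_by_value(entries, key, threshold):
    # Filtering: keep entries whose field value is at least the threshold.
    return [e for e in entries if e.fields[key] >= threshold]


def to_document(entry):
    # Export: flatten the entry into a plain dict for text-processing tools.
    return {"audio_filepath": entry.audio_filepath, **entry.fields}


def write_silence(path, seconds, rate=16000):
    # Demo helper: write a mono 16-bit WAV file containing silence.
    with wave.open(path, "wb") as wav:
        wav.setnchannels(1)
        wav.setsampwidth(2)
        wav.setframerate(rate)
        wav.writeframes(b"\x00\x00" * int(seconds * rate))


def run_demo():
    with tempfile.TemporaryDirectory() as tmp:
        short_path = os.path.join(tmp, "short.wav")
        long_path = os.path.join(tmp, "long.wav")
        write_silence(short_path, 0.5)
        write_silence(long_path, 2.0)
        entries = [
            AudioEntry(short_path),
            AudioEntry(long_path),
            AudioEntry(os.path.join(tmp, "missing.wav")),  # never created
        ]
        valid = [e for e in entries if validate_item(e)]
        measured = [add_duration(e) for e in valid]
        kept = preserve_by_value(measured, "duration", 1.0)
        return [to_document(e) for e in kept]


documents = run_demo()
```

In this sketch the missing file is dropped at validation and the half-second clip is dropped by the duration filter, so only the two-second clip reaches the export step.
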
Data Ingestion and Validation:
AudioTask file existence checks using validate() and validate_item()

Optional ASR Inference:
InferenceAsrNemoStage for automatic speech recognition
batch_size and resources parameters

Quality Assessment:
GetAudioDurationStage

Filtering and Quality Control:
PreserveByValueStage

Export and Format Conversion:
AudioToDocumentStage

ASR-First Workflow (Most Common):
AudioTask format
DocumentBatch for text processing integration

Quality-First Workflow (No ASR Required):