> Load audio datasets from various sources including FLEURS, custom manifests, and local files

# Load Audio Data

Import audio datasets from various sources into NeMo Curator's audio processing pipeline. Audio data loading supports manifest files, direct file paths, and automated dataset downloads.

## How it Works

Audio data loading in NeMo Curator centers on the `AudioTask` data structure, which contains:

* **Audio file paths**: References to audio files (.wav, .mp3, .flac, and so on)
* **Transcriptions**: Ground truth or reference text for speech content
* **Metadata**: Duration, language, speaker information, and quality metrics

The loading process validates audio file existence and formats data for downstream ASR inference and quality assessment stages.
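To make the shape of this data concrete, here is a minimal sketch of an entry carrying the three kinds of information listed above, plus an existence check like the one the loading process performs. The `AudioEntry` class, its field names, and `validate_entries` are hypothetical stand-ins written for illustration; they are not the NeMo Curator `AudioTask` API.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from pathlib import Path
from typing import Any


@dataclass
class AudioEntry:
    """Hypothetical illustration of the data an audio task carries.

    This is NOT the NeMo Curator AudioTask class; field names are assumptions.
    """

    audio_filepath: str                      # reference to a .wav/.mp3/.flac file
    text: str = ""                           # ground-truth or reference transcription
    metadata: dict[str, Any] = field(default_factory=dict)  # duration, language, ...


def validate_entries(
    entries: list[AudioEntry],
) -> tuple[list[AudioEntry], list[AudioEntry]]:
    """Split entries into (file exists, file missing)."""
    present: list[AudioEntry] = []
    missing: list[AudioEntry] = []
    for entry in entries:
        if Path(entry.audio_filepath).is_file():
            present.append(entry)
        else:
            missing.append(entry)
    return present, missing
```

Separating missing files from valid ones up front, rather than failing later, keeps downstream ASR inference and quality assessment stages from running on entries that cannot be read.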

***

## Loading Methods

Choose the appropriate loading method based on your data source and format:

* **FLEURS dataset**: Automated download and processing of the multilingual FLEURS speech dataset (automated, multilingual, 102 languages)
* **Custom manifests**: Create and load custom audio manifests with file paths and transcriptions (JSONL, TSV, custom formats)
* **Local files**: Load audio files directly from local directories and file systems (local storage, batch processing, file discovery)
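The manifest and local-file methods can be sketched with the standard library alone. The example below discovers audio files under a directory, writes one JSON object per line to a JSONL manifest, and reads it back. The helper names, the extension set, and the `audio_filepath` and `text` keys are assumptions made for illustration; check the NeMo Curator documentation for the field names and reader stages it actually expects.

```python
import json
from pathlib import Path

# Assumed set of extensions for this sketch; extend as needed.
AUDIO_EXTENSIONS = {".wav", ".mp3", ".flac"}


def discover_audio_files(root):
    """Recursively collect audio files under root, sorted for a stable order."""
    return sorted(
        p
        for p in Path(root).rglob("*")
        if p.is_file() and p.suffix.lower() in AUDIO_EXTENSIONS
    )


def write_manifest(entries, manifest_path):
    """Write one JSON object per line (JSONL)."""
    with open(manifest_path, "w", encoding="utf-8") as f:
        for entry in entries:
            f.write(json.dumps(entry, ensure_ascii=False) + "\n")


def read_manifest(manifest_path):
    """Read a JSONL manifest back into a list of dicts, skipping blank lines."""
    entries = []
    with open(manifest_path, encoding="utf-8") as f:
        for line in f:
            line = line.strip()
            if line:
                entries.append(json.loads(line))
    return entries
```

A typical use would be to build entries from the discovered paths, for example `[{"audio_filepath": str(p), "text": ""} for p in discover_audio_files("data/")]`, fill in transcriptions where they are available, and pass the list to `write_manifest`.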