NVIDIA NeMo Framework User Guide

Speech Data Processor

Speech Data Processor (SDP) is a toolkit that makes it easy to:
  1. write code to process a new dataset with minimal boilerplate.

  2. share the steps used to process a speech dataset.

SDP is hosted on GitHub: [NVIDIA/NeMo-speech-data-processor](https://github.com/NVIDIA/NeMo-speech-data-processor).

To learn more about SDP, please check the [documentation](https://nvidia.github.io/NeMo-speech-data-processor/).
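
The idea of expressing dataset processing as a shareable, ordered list of steps can be sketched in plain Python. This is only an illustration and does not use SDP's actual API: the names `normalize_text`, `drop_short`, and `run_pipeline`, and the entry fields `text` and `duration`, are hypothetical and chosen for this example.

```python
from typing import Callable, Iterable, List, Optional

# Hypothetical manifest entry: a plain dict such as {"text": ..., "duration": ...}.
Entry = dict
# A step maps an entry to a (possibly modified) entry, or to None to drop it.
Step = Callable[[Entry], Optional[Entry]]


def normalize_text(entry: Entry) -> Optional[Entry]:
    """Lowercase the transcript and collapse repeated whitespace."""
    return {**entry, "text": " ".join(entry["text"].lower().split())}


def drop_short(min_duration: float) -> Step:
    """Build a step that drops entries shorter than min_duration seconds."""
    def step(entry: Entry) -> Optional[Entry]:
        return entry if entry["duration"] >= min_duration else None
    return step


def run_pipeline(entries: Iterable[Entry], steps: List[Step]) -> List[Entry]:
    """Apply each step in order; an entry is kept only if no step drops it."""
    kept = []
    for entry in entries:
        for step in steps:
            entry = step(entry)
            if entry is None:
                break  # dropped by this step, skip the remaining steps
        else:
            kept.append(entry)
    return kept


# The ordered list of steps is the whole description of the processing.
steps = [normalize_text, drop_short(1.0)]
manifest = [
    {"text": "  Hello   World ", "duration": 2.5},
    {"text": "Hi", "duration": 0.4},
]
processed = run_pipeline(manifest, steps)
```

Because the processing is just an ordered list of small steps, the list itself records how the dataset was prepared and can be shared or reused for another dataset.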


Copyright © 2023-2026, NVIDIA Corporation.

Last updated on Apr 13, 2026.