NVIDIA NeMo Framework User Guide

Speech AI Tools

NeMo provides a set of tools for developing Automatic Speech Recognition (ASR) and Text-to-Speech (TTS) synthesis models. The tools are available in the NVIDIA/NeMo repository:

  • NeMo Forced Aligner (NFA)
  • Dataset Creation Tool Based on CTC-Segmentation
  • Speech Data Explorer
  • Comparison tool for ASR Models
  • ASR Evaluator

Additional NeMo-related tools are hosted in separate GitHub repositories:

  • Speech Data Processor

Copyright © 2023-2026, NVIDIA Corporation.

Last updated on Apr 13, 2026.