NVIDIA NeMo Framework User Guide

Speech AI Tools

NeMo provides a set of tools for developing Automatic Speech Recognition (ASR) and Text-to-Speech (TTS) synthesis models. The following tools are hosted in the NVIDIA/NeMo repository:

  • NeMo Forced Aligner (NFA)
  • Dataset Creation Tool Based on CTC-Segmentation
  • Speech Data Explorer
  • Comparison tool for ASR Models
  • ASR Evaluator

Additional NeMo-related tools are hosted in separate GitHub repositories:

  • Speech Data Processor
  • (Inverse) Text Normalization

Copyright © 2023-2025, NVIDIA Corporation.

Last updated on Nov 24, 2025.