NVIDIA NeMo Framework User Guide

Important

NeMo 2.0 is an experimental feature and is currently released only in the dev container: nvcr.io/nvidia/nemo:dev. Refer to the NeMo 2.0 overview for information on getting started.

NeMo Common Collection API

The common collection contains components that are shared across all NeMo collections: callbacks, losses, metrics, tokenizers, data utilities, and S3 checkpointing. Short usage sketches for several of these components follow the contents list below.

  • Callbacks
    • Exponential Moving Average (EMA)
  • Losses
    • AggregatorLoss
      • __init__()
    • CrossEntropyLoss
      • __init__()
    • MSELoss
      • __init__()
    • SmoothedCrossEntropyLoss
      • __init__()
    • SpanningLoss
      • __init__()
  • Metrics
    • Perplexity
      • compute()
      • full_state_update
      • update()
  • Tokenizers
    • AutoTokenizer
      • __init__()
    • SentencePieceTokenizer
      • __init__()
    • TokenizerSpec
      • __init__()
  • Data
    • ConcatDataset
      • get_iterable()
      • random_generator()
      • round_robin_generator()
      • temperature_generator()
    • ConcatMapDataset
  • S3 Checkpointing
    • S3CheckpointIO
    • S3Utils and Dependencies
    • s3_dirpath_utils
    • S3 Demands and ExpManager Details When Running at Scale

