Installation#

This page covers how to install NVIDIA NeMo for speech AI tasks (ASR, TTS, speaker tasks, audio processing, and speech language models).

Prerequisites#

Before installing NeMo, ensure you have:

  1. Python 3.12 or above

  2. PyTorch 2.7+ (install before NeMo so CUDA wheels match your GPU driver)

  3. NVIDIA GPU (required for training; CPU-only inference is possible but slow)
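The prerequisites above can be checked from Python before installing anything else. The following is a minimal sketch, not an official NeMo tool: the function names check_prerequisites and parse_major_minor are hypothetical, and the minimum versions are copied from the list above. It reports a missing PyTorch rather than failing.

```python
from __future__ import annotations

import importlib.metadata
import re
import sys

# Minimum versions taken from the prerequisites list above.
MIN_PYTHON = (3, 12)
MIN_TORCH = (2, 7)


def parse_major_minor(version: str) -> tuple[int, int]:
    """Extract (major, minor) from strings such as '2.7.0+cu126'."""
    match = re.match(r"(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    return int(match.group(1)), int(match.group(2))


def check_prerequisites() -> dict:
    """Report whether Python, PyTorch, and CUDA meet the listed prerequisites."""
    report = {
        "python_ok": sys.version_info[:2] >= MIN_PYTHON,
        "torch_version": None,
        "torch_ok": False,
        "cuda_available": False,
    }
    try:
        torch_version = importlib.metadata.version("torch")
    except importlib.metadata.PackageNotFoundError:
        return report  # PyTorch is not installed yet

    report["torch_version"] = torch_version
    report["torch_ok"] = parse_major_minor(torch_version) >= MIN_TORCH
    try:
        import torch  # imported lazily so the script also runs without PyTorch
        report["cuda_available"] = bool(torch.cuda.is_available())
    except ImportError:
        pass  # metadata exists but the package cannot be imported
    return report


if __name__ == "__main__":
    for key, value in check_prerequisites().items():
        print(f"{key}: {value}")
```

A False value for cuda_available is acceptable for CPU-only inference, but training needs a visible GPU.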

Install from PyPI#

The quickest way to install NeMo is via pip. Install only the collections you need:

# Quote the requirement so shells such as zsh do not expand the brackets
# Install ASR and TTS (most common)
pip install 'nemo_toolkit[asr,tts]'

# Install everything speech-related
pip install 'nemo_toolkit[asr,tts,audio]'

Available extras:

Extra    What it includes
asr      Automatic Speech Recognition models, data loaders, and utilities
tts      Text-to-Speech models, vocoders, and audio codecs
audio    Audio processing models (enhancement, separation)
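The set of extras can differ between releases, so it may help to ask the installed package which extras it declares. The sketch below reads the Provides-Extra metadata field through the standard library. The helper name declared_extras is hypothetical, and the distribution name nemo_toolkit is taken from the pip commands above.

```python
from __future__ import annotations

from importlib import metadata


def declared_extras(distribution: str = "nemo_toolkit") -> list[str] | None:
    """Return the extras declared by an installed distribution.

    Returns None when the distribution is not installed, and an empty list
    when it is installed but declares no extras.
    """
    try:
        dist_metadata = metadata.metadata(distribution)
    except metadata.PackageNotFoundError:
        return None
    return sorted(dist_metadata.get_all("Provides-Extra") or [])


if __name__ == "__main__":
    extras = declared_extras()
    if extras is None:
        print("nemo_toolkit is not installed in this environment.")
    else:
        print("Declared extras:", ", ".join(extras) or "(none)")
```

This lists only what the package declares. It does not show which extras were actually selected at install time.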

Install from Source#

For the latest development version or if you plan to contribute, clone the repository and install in editable mode.

The test extra pulls in pytest and the tooling needed for the test suite, but it does not install the dependencies of the NeMo collections (ASR, TTS, audio, and so on). Add those extras explicitly; otherwise imports such as nemo.collections.asr will fail.

git clone https://github.com/NVIDIA/NeMo.git
cd NeMo

# After PyTorch is installed (see Prerequisites above):
# Collections you need for development (required for nemo.collections.* imports)
pip install -e '.[asr,tts]'

# Optional: add test to run pytest with NeMo’s dev test dependencies
# pip install -e '.[asr,tts,test]'
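Because a source checkout contains every collection's code whether or not its dependencies were installed, the practical way to tell whether an extra is usable is to try importing it. The sketch below does that for the three extras listed earlier. Only nemo.collections.asr is named in the text above; the tts and audio module paths are assumed by analogy, and the helper names are hypothetical.

```python
from __future__ import annotations

import importlib

# Extra name -> module it is expected to provide.
# Assumption: tts and audio follow the same nemo.collections.<name> pattern as asr.
COLLECTION_MODULES = {
    "asr": "nemo.collections.asr",
    "tts": "nemo.collections.tts",
    "audio": "nemo.collections.audio",
}


def import_error(module_name: str) -> str | None:
    """Try to import a module; return None on success or the error text on failure."""
    try:
        importlib.import_module(module_name)
    except Exception as exc:  # broad on purpose: this is a diagnostic helper
        return f"{type(exc).__name__}: {exc}"
    return None


def check_collections() -> dict[str, str | None]:
    """Map each extra to None (usable) or to the error raised on import."""
    return {extra: import_error(module) for extra, module in COLLECTION_MODULES.items()}


if __name__ == "__main__":
    for extra, error in check_collections().items():
        status = "ok" if error is None else f"unavailable ({error})"
        print(f"{extra}: {status}")
```

An entry reported as unavailable usually means the matching extra was left out of the pip install -e command.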

Using Docker#

NVIDIA provides Docker containers with NeMo pre-installed. Check the NeMo GitHub releases for the latest container tags.

Verify Installation#

After installing, verify that NeMo is working:

import nemo.collections.asr as nemo_asr
print("NeMo ASR installed successfully!")

# Quick test: load a pretrained model (downloads the checkpoint on first use)
model = nemo_asr.models.ASRModel.from_pretrained("nvidia/parakeet-tdt-0.6b-v2")
print(f"Model loaded: {model.__class__.__name__}")

What’s Next?#