Important

You are viewing the NeMo 2.0 documentation. This release introduces significant changes to the API and a new library, NeMo Run. We are currently porting all features from NeMo 1.0 to 2.0. For documentation on previous versions or features not yet available in 2.0, please refer to the NeMo 24.07 documentation.

Checkpoints#

There are two main ways to load pretrained checkpoints in NeMo:

  • Using the restore_from() method to load a local checkpoint file (.nemo), or

  • Using the from_pretrained() method to download and set up a checkpoint from the cloud.

Note that these instructions are for loading fully trained checkpoints for evaluation or fine-tuning. To resume an unfinished training experiment, use the Experiment Manager and set its resume_if_exists flag to True.
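As a rough sketch of the resume case, the snippet below builds an Experiment Manager configuration with resume_if_exists enabled. The helper names build_resume_config and setup_experiment are hypothetical. The import path nemo.utils.exp_manager and the resume_ignore_no_checkpoint option reflect the NeMo 1.0-style Experiment Manager and are assumptions here; check them against the release you are running.

```python
def build_resume_config(exp_dir, name):
    """Return an Experiment Manager config dict that resumes an unfinished run.

    Hypothetical helper for illustration; not part of NeMo.
    """
    return {
        "exp_dir": exp_dir,
        "name": name,
        # The flag named in the text above: pick up the latest checkpoint if present.
        "resume_if_exists": True,
        # Assumption: start from scratch instead of failing when no checkpoint exists yet.
        "resume_ignore_no_checkpoint": True,
    }


def setup_experiment(trainer, exp_dir, name):
    """Attach the Experiment Manager to an existing trainer (hypothetical helper)."""
    # Imported lazily so the config helper above works without NeMo installed.
    from omegaconf import OmegaConf
    from nemo.utils.exp_manager import exp_manager  # assumed NeMo 1.0-style entry point

    cfg = OmegaConf.create(build_resume_config(exp_dir, name))
    return exp_manager(trainer, cfg)
```

Keeping the configuration in a plain dictionary makes it easy to inspect or log before the trainer is created.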

Local Checkpoints#

  • Save Model Checkpoints: NeMo automatically saves final model checkpoints with the .nemo suffix. You can also manually save any model checkpoint using model.save_to(<checkpoint_path>.nemo).

  • Load Model Checkpoints: To load a checkpoint saved at <path/to/checkpoint/file.nemo>, use the restore_from() method as shown below, where <MODEL_BASE_CLASS> is the model class of the original checkpoint.

import nemo.collections.audio as nemo_audio
model = nemo_audio.models.<MODEL_BASE_CLASS>.restore_from(restore_path="<path/to/checkpoint/file.nemo>")
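As a minimal sketch of the loading step above, the helper below checks the path before passing it to restore_from(). The helper name restore_audio_model is hypothetical, and the default class SchroedingerBridgeAudioToAudioModel is only an illustrative choice taken from the model tables on this page; substitute the class your checkpoint was actually trained with.

```python
from pathlib import Path


def restore_audio_model(checkpoint_path, model_class_name="SchroedingerBridgeAudioToAudioModel"):
    """Restore a local .nemo checkpoint (hypothetical helper, not part of NeMo)."""
    path = Path(checkpoint_path)

    # Fail early with a clear message before importing the heavy NeMo package.
    if path.suffix != ".nemo":
        raise ValueError(f"Expected a .nemo file, got: {path.name}")
    if not path.is_file():
        raise FileNotFoundError(f"Checkpoint not found: {path}")

    # Imported lazily so the checks above run even where NeMo is not installed.
    import nemo.collections.audio as nemo_audio

    # Look up the model class by name, then restore as described in the text.
    model_class = getattr(nemo_audio.models, model_class_name)
    return model_class.restore_from(restore_path=str(path))
```

The restored model can later be written back out with model.save_to(<checkpoint_path>.nemo), as described in the first bullet.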

Pretrained Checkpoints#

The tables in Checkpoints below list some of the available pretrained audio processing models, including models for speech processing, restoration, and extraction.

Load Model Checkpoints#

The models can be accessed via the from_pretrained() method of the audio model class. In general, you can load any of these models with code in the following format:

import nemo.collections.audio as nemo_audio
model = nemo_audio.models.<MODEL_BASE_CLASS>.from_pretrained(model_name="<MODEL_NAME>")

where <MODEL_NAME> is the value in the Model Name column of the tables in Checkpoints. These names are predefined in each model’s list_available_models() member function.
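To make the name-to-class pairing concrete, the sketch below copies the Model Name and Model Class values from the tables on this page into a lookup and uses it to call from_pretrained(). The dictionary PRETRAINED_AUDIO_MODELS and the helper load_pretrained_audio_model are hypothetical conveniences, not part of NeMo.

```python
# Model names and classes copied from the tables in the Checkpoints section.
PRETRAINED_AUDIO_MODELS = {
    "nvidia/se_den_sb_16k_small": "SchroedingerBridgeAudioToAudioModel",
    "nvidia/se_der_sb_16k_small": "SchroedingerBridgeAudioToAudioModel",
    "nvidia/sr_ssl_flowmatching_16k_430m": "FlowMatchingAudioToAudioModel",
}


def load_pretrained_audio_model(model_name):
    """Download and set up a pretrained model (hypothetical helper, not part of NeMo)."""
    if model_name not in PRETRAINED_AUDIO_MODELS:
        raise KeyError(f"Unknown model name: {model_name}")

    # Imported lazily so the lookup above works even where NeMo is not installed.
    import nemo.collections.audio as nemo_audio

    # Resolve the class listed for this model, then load it by name.
    model_class = getattr(nemo_audio.models, PRETRAINED_AUDIO_MODELS[model_name])
    return model_class.from_pretrained(model_name=model_name)
```

For example, load_pretrained_audio_model("nvidia/se_den_sb_16k_small") would resolve to SchroedingerBridgeAudioToAudioModel before downloading the checkpoint.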

Audio Models#

Speech Enhancement Models#

| Model Name | Dataset | Sampling Rate | Model Class | Model Card |
|---|---|---|---|---|
| nvidia/se_den_sb_16k_small | WSJ0+CHiME | 16000 Hz | nemo.collections.audio.models.SchroedingerBridgeAudioToAudioModel | se_den_sb_16k_small |
| nvidia/se_der_sb_16k_small | WSJ0+Reverb | 16000 Hz | nemo.collections.audio.models.SchroedingerBridgeAudioToAudioModel | se_der_sb_16k_small |

SSL Models#

| Model Name | Dataset | Sampling Rate | Model Class | Model Card |
|---|---|---|---|---|
| nvidia/sr_ssl_flowmatching_16k_430m | Libri-Light | 16000 Hz | nemo.collections.audio.models.FlowMatchingAudioToAudioModel | sr_ssl_flowmatching_16k_430m |