Important

NeMo 2.0 is an experimental feature and is currently released in the dev container only: nvcr.io/nvidia/nemo:dev. Please refer to the NeMo 2.0 overview for information on getting started.

Nemotron

Nemotron is a Large Language Model (LLM) that can be integrated into a synthetic data generation pipeline to produce training data, assisting researchers and developers in building their own LLMs.

The following examples use the NeMo Framework Launcher, which provides a user-friendly interface for building end-to-end model development workflows. To get started, follow the Installation Steps, then start the NeMo Framework container, ensuring that the launcher and any data folders are mounted.

All the config scripts that you’ll use in the examples are located in NeMo-Framework-Launcher/launcher_scripts.
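As a rough sketch of how a launcher run is typically kicked off, the commands below invoke `main.py` from `launcher_scripts` with Hydra-style overrides. The stage and config names (for example `nemotron/nemotron_8b`) are illustrative placeholders, not verified values; check the files shipped under `conf/` in your container version for the exact names.

```shell
cd NeMo-Framework-Launcher/launcher_scripts

# Select a stage and a model config via Hydra overrides.
# "nemotron/nemotron_8b" is a hypothetical config name -- list the
# files in conf/training to find the Nemotron configs available
# in your release.
python main.py \
    stages=[training] \
    training=nemotron/nemotron_8b \
    launcher_scripts_path=${PWD}
```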

We provide playbooks to showcase NeMo features including PEFT, SFT, and deployment with PTQ:

Note

If you are using a NeMo Framework container version <=24.05, make sure to mount the latest NeMo-Framework-Launcher so that the correct Nemotron config is available for your workflow. See the instructions below:

  1. Clone the latest NeMo-Framework-Launcher:

    git clone git@github.com:NVIDIA/NeMo-Framework-Launcher.git
    
  2. Launch the Docker container with the above repository mounted (note that the mount target inside the container must be an absolute path):

    docker run --gpus all -it --rm --shm-size=4g -p 8000:8000 -v ${PWD}/NeMo-Framework-Launcher:/opt/NeMo-Framework-Launcher nvcr.io/nvidia/nemo:version
    

| Feature                                    | Status |
|--------------------------------------------|--------|
| Data parallelism                           | ✓      |
| Tensor parallelism                         | ✓      |
| Pipeline parallelism                       | ✓      |
| Interleaved Pipeline Parallelism Schedule  | N/A    |
| Sequence parallelism                       | ✓      |
| Selective activation checkpointing         | ✓      |
| Gradient checkpointing                     | ✓      |
| Partial gradient checkpointing             | ✓      |
| FP32/TF32                                  | ✓      |
| AMP/FP16                                   | ✓      |
| BF16                                       | ✓      |
| TransformerEngine/FP8                      | ✓      |
| Multi-GPU                                  | ✓      |
| Multi-Node                                 | ✓      |
| Inference                                  | ✓      |
| Slurm                                      | ✓      |
| Base Command Manager                       | ✓      |
| Base Command Platform                      | ✓      |
| Distributed data preprocessing             | ✓      |
| NVfuser                                    | ✓      |
| P-Tuning and Prompt Tuning                 | ✓      |
| IA3 and Adapter learning                   | ✓      |
| Distributed Optimizer                      | ✓      |
| Distributed Checkpoint                     | ✓      |
| Fully Sharded Data Parallel                | N/A    |