NeMo-Aligner#

Introduction#

NeMo-Aligner is a scalable toolkit for efficient model alignment. The toolkit supports state-of-the-art model alignment algorithms such as SteerLM, Direct Preference Optimization (DPO), and Reinforcement Learning from Human Feedback (RLHF). These algorithms enable users to align language models to be safe, harmless, and helpful. Users can perform end-to-end model alignment on a wide range of model sizes and take advantage of all supported parallelism techniques so that alignment runs in a performant and resource-efficient manner. For more technical details, please refer to our paper.

The NeMo-Aligner toolkit is built using the NeMo Toolkit, which allows training to scale to thousands of GPUs using tensor, data, and pipeline parallelism for all components of alignment. All of our checkpoints are cross-compatible with the NeMo ecosystem, allowing for inference deployment and further customization.

The toolkit is currently in its early stages. We are committed to improving the toolkit to make it easier for developers to pick and choose different alignment algorithms to build safe, helpful, and reliable models.

Get Started#

NeMo-Aligner comes preinstalled in NVIDIA NeMo containers. New NeMo containers are released alongside NeMo version updates.

To get access to the container, log in to the NVIDIA GPU Cloud (NGC) platform or create a free NGC account here: NVIDIA NGC. Once you have logged in, you can get the container here: NVIDIA NGC NeMo Framework.

To run interactively using a pre-built container, run the following command:

docker run --rm -it \
  --gpus all --ipc=host --ulimit memlock=-1 --ulimit stack=67108864 --shm-size=8g \
  --workdir /opt/NeMo-Aligner \
  nvcr.io/nvidia/nemo:24.09

Please use the latest tag in the form yy.mm.(patch).

Important

  • Some of the subsequent tutorials require accessing gated Hugging Face models. For details on how to access these models, refer to this document.

  • If you run into any problems, refer to NeMo’s Known Issues page. The page enumerates known issues and provides suggested workarounds where appropriate.

Build a NeMo-Aligner Dockerfile#

NeMo-Aligner also provides its own Dockerfile if you want to customize the environment. Run the following to build the image:
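A minimal sketch, assuming you build from a local clone of the NeMo-Aligner repository (the image tag below is illustrative; adjust it to your setup):

# Clone the NeMo-Aligner repository and build from its Dockerfile.
git clone https://github.com/NVIDIA/NeMo-Aligner.git
cd NeMo-Aligner
docker build -t nemo-aligner:latest .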

Prerequisite: Obtaining a Pre-Trained Model

This section provides instructions on how to download pre-trained LLMs in .nemo format. The sections that follow use these base LLMs for further fine-tuning and alignment.
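As an illustration, many of the tutorials start from a small publicly hosted .nemo checkpoint. The model, URL, and filename below are assumptions; the tutorial you follow may use a different base model:

# Download a small base checkpoint in .nemo format (example model; adjust as needed).
mkdir -p models/2b
wget -P models/2b https://huggingface.co/nvidia/GPT-2B-001/resolve/main/GPT-2B-001_bf16_tp1.nemo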

Model Alignment by Supervised Fine-Tuning (SFT)

In this section, we walk you through the most straightforward alignment method. We use a supervised dataset in the prompt-response pairs format to fine-tune the base model according to the desired behavior.

Supervised Fine-Tuning (SFT) with Knowledge Distillation

In this section, we walk through a variation of SFT using Knowledge Distillation where we train a smaller “student” model using a larger “teacher” model.
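As a rough sketch (the exact loss used in the tutorial may differ), knowledge distillation typically combines the standard SFT cross-entropy with a term that pulls the student's next-token distribution toward the teacher's:

\mathcal{L}(\theta) = (1-\alpha)\,\mathcal{L}_{\text{SFT}}(\theta) \;+\; \alpha \sum_{t} D_{\mathrm{KL}}\big(p_{\text{teacher}}(\cdot \mid x, y_{<t}) \,\|\, p_{\theta}(\cdot \mid x, y_{<t})\big)

where p_θ is the student model and α balances imitation of the ground-truth responses against imitation of the teacher.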

Model Alignment by DPO, RPO and IPO

DPO, RPO, and IPO are simpler alignment methods compared to RLHF. DPO introduces a novel parameterization of the reward model in RLHF, which allows us to extract the corresponding optimal policy. Similarly, RPO and IPO provide alternative parameterizations or optimization strategies, each contributing unique approaches to refining model alignment.
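For reference, the DPO objective from the original paper trains the policy π_θ directly on preference pairs (a chosen response y_w and a rejected response y_l) against a frozen reference policy π_ref, with no explicit reward model:

\mathcal{L}_{\mathrm{DPO}}(\pi_\theta;\pi_{\mathrm{ref}}) = -\,\mathbb{E}_{(x,y_w,y_l)\sim\mathcal{D}}\Big[\log\sigma\Big(\beta\log\tfrac{\pi_\theta(y_w\mid x)}{\pi_{\mathrm{ref}}(y_w\mid x)} - \beta\log\tfrac{\pi_\theta(y_l\mid x)}{\pi_{\mathrm{ref}}(y_l\mid x)}\Big)\Big]

Here β controls the strength of the implicit KL constraint to the reference policy; RPO and IPO replace this loss with their own variants.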

Model Alignment by RLHF

RLHF is the next step up in alignment and is still responsible for most state-of-the-art chat models. In this section, we walk you through the process of RLHF alignment, including training a reward model and RLHF training with the PPO algorithm.
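At a high level, the PPO stage optimizes the standard KL-regularized RLHF objective against the learned reward model r_φ:

\max_{\pi_\theta}\; \mathbb{E}_{x\sim\mathcal{D},\,y\sim\pi_\theta(\cdot\mid x)}\big[r_\phi(x,y)\big] \;-\; \beta\, D_{\mathrm{KL}}\big(\pi_\theta(\cdot\mid x)\,\|\,\pi_{\mathrm{ref}}(\cdot\mid x)\big)

where π_ref is the initial (typically SFT) policy and β limits how far the aligned policy may drift from it.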

Model Alignment by SteerLM Method

SteerLM is a novel approach developed by NVIDIA that simplifies alignment compared to RLHF. It is based on SFT but enables user-steerable AI by letting you adjust attributes at inference time.

Model Alignment by SteerLM 2.0 Method

SteerLM 2.0 is an extension of the SteerLM method that introduces an iterative training procedure to explicitly encourage the generated responses to follow the desired attribute distribution.

Model Alignment by Rejection Sampling (RS)

RS is a simple online alignment algorithm. In RS, the policy model generates several responses. These responses are assigned a score by the reward model, and the highest scoring responses are used for SFT.
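Schematically, for each prompt x the policy samples K candidate responses, the reward model R scores them, and the highest-scoring candidates become SFT targets (how many responses are kept per prompt is a configuration detail):

y_1,\dots,y_K \sim \pi_\theta(\cdot\mid x), \qquad y^{*} = \arg\max_{1\le i\le K} R(x, y_i)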

Fine-tuning Stable Diffusion with DRaFT+

DRaFT+ is an algorithm for fine-tuning text-to-image generative diffusion models. It achieves this by directly backpropagating through a reward model. This approach addresses the mode collapse issues from the original DRaFT algorithm and improves diversity through regularization.
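Schematically, DRaFT-style fine-tuning maximizes the reward of generated images by differentiating through the reward model and the sampling chain, and DRaFT+ adds a regularization term to counteract mode collapse (the precise regularizer is defined in the DRaFT+ paper; the form below is only a sketch):

\max_{\theta}\; \mathbb{E}_{c,\,\epsilon}\big[R\big(g_\theta(c,\epsilon),\,c\big)\big] \;-\; \lambda\,\mathrm{Reg}(\theta)

where g_θ(c, ε) is the image the diffusion model produces for prompt c and noise ε, and R is a differentiable reward model.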

Constitutional AI: Harmlessness from AI Feedback

CAI, an alignment method developed by Anthropic, enables the incorporation of AI feedback for aligning LLMs. This feedback is grounded in a small set of principles (referred to as the ‘Constitution’) that guide the model toward desired behaviors, emphasizing helpfulness, honesty, and harmlessness.

Algorithm vs. (NLP) Models#

| Algorithm                       | TRTLLM Accelerated | GPT 2B  | LLaMA2  | LLaMA3  | Mistral | Nemotron-4 | Mixtral               |
|---------------------------------|--------------------|---------|---------|---------|---------|------------|-----------------------|
| SFT                             |                    | Yes (✓) | Yes     | Yes     | Yes     | Yes (✓)    |                       |
| SFT with Knowledge Distillation |                    | Yes (✓) | Yes     | Yes     | Yes     | Yes        |                       |
| DPO                             |                    | Yes (✓) | Yes     | Yes     | Yes     | Yes (✓)    | In active development |
| RLHF                            | Yes                | Yes     | Yes     | Yes (✓) | Yes     | Yes (✓)    |                       |
| SteerLM                         |                    | Yes     | Yes (✓) | Yes     | Yes     | Yes        |                       |
| SteerLM 2.0                     |                    | Yes     | Yes     | Yes     | Yes     | Yes        |                       |
| Rejection Sampling              | Yes                | Yes     | Yes     | Yes     | Yes     |            |                       |
| CAI                             |                    | Yes     | Yes     | Yes     | Yes (✓) | Yes        |                       |

Algorithm vs. (Multimodal) Models#

| Algorithm | Stable Diffusion |
|-----------|------------------|
| DRaFT+    | Yes (✓)          |

Note

  • (✓): Indicates the model is verified to work with the algorithm. Models without this demarcation are expected to work but have not been formally verified yet.

Hardware Requirements#

NeMo-Aligner is built on other NVIDIA libraries that support a range of NVIDIA GPUs. NeMo-Aligner is tested on H100 GPUs but also works on A100. Several tutorials assume 80 GB of VRAM, so if you are following along on GPUs with 40 GB, adjust your configuration accordingly.

Examples of such adjustments include increasing the node count, introducing more tensor or pipeline parallelism, lowering the batch size, and increasing gradient accumulation.
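For illustration only, a hypothetical set of Hydra-style overrides for one of the training scripts (the script path and exact config keys are assumptions; check the configuration of the tutorial you are actually running):

# Hypothetical overrides: spread the model across more GPUs and shrink per-GPU batches.
python examples/nlp/gpt/train_gpt_sft.py \
    trainer.num_nodes=2 \
    model.tensor_model_parallel_size=4 \
    model.pipeline_model_parallel_size=1 \
    model.micro_batch_size=1 \
    model.global_batch_size=128
# Keeping global_batch_size fixed while lowering micro_batch_size effectively
# increases gradient accumulation.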