NeMo Gym Documentation#

NeMo Gym is a library for building reinforcement learning (RL) training environments for large language models (LLMs). NeMo Gym provides infrastructure to develop environments, scale rollout collection, and integrate seamlessly with your preferred training framework.

A training environment consists of three server components: Agents orchestrate the rollout lifecycle—calling models, executing tool calls through resources, and coordinating verification. Models provide stateless text generation using LLM inference endpoints. Resources define tasks, tool implementations, and verification logic.

Explore Tutorials

Introduction to NeMo Gym#

Understand NeMo Gym’s purpose and core components before diving into tutorials.

About NeMo Gym

Motivation and benefits of NeMo Gym.

motivation benefits

About NVIDIA NeMo Gym

Concepts

Core components, configuration, verification and RL terminology.

agents models resources

Understanding Concepts for NeMo Gym

Ecosystem

Understand how NeMo Gym fits within the NVIDIA NeMo Framework.

nemo-framework

NeMo Gym in the NVIDIA Ecosystem

Get Started#

Install and run NeMo Gym to start collecting rollouts.

Quickstart

Run a training environment and start collecting rollouts in under 5 minutes.

Detailed Setup Guide

Detailed walkthrough of running your first training environment.

environment configuration

Detailed Setup Guide

Rollout Collection

Collect and view rollouts

rollouts training-data

Rollout Collection

Tutorials#

Hands-on tutorials to build and customize your training environments.

Build a Resource Server

Implement or integrate existing tools and define task verification logic.

beginner 30 min custom-environments tools

Creating a Resource Server

Offline Training with Rollouts

Transform rollouts into training data for supervised fine-tuning (SFT) and direct preference optimization (DPO).

sft dpo

Offline Training with Rollouts (SFT/DPO) - Experimental

GRPO with NeMo RL

Learn how to set up NeMo Gym and NeMo RL training environments, run tests, prepare data, and launch single-node and multi-node training runs.

training rl grpo

RL Training with NeMo RL using GRPO

Contribute#

Contribute to NeMo Gym development.

Contribute Environments

Contribute new environments or integrate existing benchmarks.

environments

Contribute Environments

Integrate RL Frameworks

Implement NeMo Gym integration into a new training framework.

training-integration

Training Framework Integration