Understanding Concepts for NeMo Gym
NeMo Gym concepts explain the mental model behind building RL training environments: when to use RL over SFT, how environment components work together, and how verification signals drive learning. Use this page as a compass to decide which explanation to read next.
Tip
New to RL for LLMs? Start with Training Approaches for context on SFT, RL, and RLVR, or refer to Key Terminology for a quick glossary.
Concept Highlights
Each explainer below covers one foundational idea and links to deeper material.
Understand the differences between SFT, DPO, and GRPO, and the rise of RLVR.
Understand the three server components that make up a training environment.
Understand how servers are configured and connected.
Understand how components interact during startup and rollout collection.
Understand the importance of verification and common implementation patterns.
Essential vocabulary for agent training, RL workflows, and NeMo Gym.
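To make the idea of a verification signal concrete before you dive into the explainers, here is a minimal sketch of a verifier that turns a model response into a scalar reward. It is an illustration only and does not use the NeMo Gym API: the function name `verify_exact_match`, the normalization rule, and the sample rollouts are all hypothetical.

```python
def verify_exact_match(response: str, expected: str) -> float:
    """Hypothetical verifier: 1.0 if the normalized response matches, else 0.0."""

    def normalize(text: str) -> str:
        # Lowercase and collapse whitespace so trivial formatting
        # differences do not change the reward.
        return " ".join(text.strip().lower().split())

    return 1.0 if normalize(response) == normalize(expected) else 0.0


# Score a small batch of made-up rollouts against one expected answer.
rollouts = ["  Paris ", "Lyon", "paris"]
rewards = [verify_exact_match(r, "Paris") for r in rollouts]
```

A binary reward like this is the simplest kind of verifiable signal. Real environments often need task-specific checks, which the verification explainer covers in more depth.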