> For clean Markdown of any page, append `.md` to the page URL.
> For a complete documentation index, see https://docs.nvidia.com/nemo/gym/llms.txt.
> For full documentation content, see https://docs.nvidia.com/nemo/gym/llms-full.txt.

# Concepts

NeMo Gym concepts explain the mental model behind building RL training environments: when to use RL over SFT, how environment components work together, and how verification signals drive learning. Use this page as a compass to decide which explanation to read next.

New to RL for LLMs? Start with [Training Approaches](/v0.2/about/concepts/training-approaches) for context on SFT, RL, and RLVR, or refer to [Key Terminology](/v0.2/about/concepts/key-terminology) for a quick glossary.

***

## Concept Highlights

Each explainer below covers one foundational idea and links to deeper material.

* Understand the differences between SFT, DPO, and GRPO, and the rise of RLVR.
* Understand the three server components that make up a training environment.
* Understand how servers are configured and connected.
* See how components interact during startup and rollout collection.
* Understand the importance of verification and common implementation patterns.
* Learn the essential vocabulary for agent training, RL workflows, and NeMo Gym.

***