Training Tutorials

The hands-on tutorials below show how to train with NeMo Gym environments in each supported training framework. To integrate another training framework, see the Training Framework Integration Guide.

See Training for a refresher on when to use GRPO, SFT, or DPO.

RL (GRPO)

NeMo RL

Tutorial series: GRPO training to improve multi-step tool calling on the Workplace Assistant environment, scaling from single-node to multi-node training.

nemo rl | grpo | 3-5 hours

OpenRLHF

Review the agent executor for using NeMo Gym environments with OpenRLHF.

openrlhf

Unsloth

Example GRPO training on instruction-following and reasoning environments.

unsloth | single-gpu | 30 min

VeRL

Example DAPO training on math and agentic environments using VeRL, with single- and multi-environment support.

verl | dapo | multi-node | 1 hour