Algorithms
NeMo RL supports multiple training algorithms for post-training large language models.
Support Matrix
| Algorithms | Single Node | Multi-node |
|---|---|---|
| DAPO (dapo.md) | similar to GRPO example | similar to GRPO example |

On-policy distillation is also supported in the PyTorch DTensor path.