Features and Roadmap

Available Now | Coming in v0.6

Coming in v0.6

Muon Optimizer - Emerging Optimizer support for SFT/RL.
Megatron Inference - Improved performance for Megatron Inference (avoids weight conversion).
SGLang Inference - SGLang rollout support for optimized inference.
Improved Native Performance - Reduce training time for native PyTorch models.
Improved Large MoE Performance - Improve Megatron Core training and generation performance.
New Models - Qwen3-Next, Nemotron-Super.
Expand Algorithms - GDPO, LoRA support for RL (GRPO) and DPO.
Resiliency - Fault tolerance and auto-scaling support.
On-Policy Distillation - Multi-teacher and cross-tokenizer distillation support.
Speculative Decoding - Speculative decoding support for rollout acceleration.

Available Now

Distributed Training - Ray-based infrastructure.
Environment Support and Isolation - Support for multi-environment training and dependency isolation between components.
Worker Isolation - Process isolation between RL actors, so global state in one actor cannot affect another.
Learning Algorithms - GRPO/GSPO/DAPO, SFT (with LoRA), DPO, and on-policy distillation.
Multi-Turn RL - Multi-turn generation and training for RL with tool use, games, etc.
Advanced Parallelism with DTensor - PyTorch FSDP2, TP, CP, and SP for efficient training (through NeMo AutoModel).
Larger Model Support with Longer Sequences - Performant parallelisms with Megatron Core (TP/PP/CP/SP/EP/FSDP) (through NeMo Megatron Bridge).
Sequence Packing - Sequence packing in both DTensor and Megatron Core, which reduces padding waste and substantially improves training throughput.
Fast Generation - vLLM backend for optimized inference.
Hugging Face Integration - Out-of-the-box support in the DTensor path; checkpoint conversion is available for the Megatron path through the Megatron Bridge middleware.
End-to-End FP8 Low-Precision Training - Support for Megatron Core FP8 training and FP8 vLLM generation.
Vision Language Models (VLM) - Support for SFT and GRPO on VLMs.
Megatron Inference - Fast day-0 support for new Megatron models (avoids weight conversion).
Async RL - Support for asynchronous rollouts and replay buffers for off-policy training, enabling fully asynchronous GRPO.
NeMo-Gym Integration - Integration with NeMo-Gym RL environments.
GB200 - Container support for GB200.
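
To give a rough sense of why the sequence packing feature listed above helps, the sketch below packs variable-length sequences into fixed-size token budgets with a simple first-fit-decreasing heuristic. It is a generic illustration and not this library's implementation; the names `pack_sequences`, `padding_fraction`, and `max_tokens` are hypothetical and chosen only for this example.

```python
from typing import List


def pack_sequences(lengths: List[int], max_tokens: int) -> List[List[int]]:
    """Group sequence indices into bins whose total length fits max_tokens.

    Uses first-fit-decreasing: visit sequences from longest to shortest and
    place each one in the first bin that still has room.
    """
    if any(n > max_tokens for n in lengths):
        raise ValueError("a sequence is longer than max_tokens")

    order = sorted(range(len(lengths)), key=lambda i: lengths[i], reverse=True)
    bins: List[List[int]] = []  # sequence indices per packed row
    free: List[int] = []        # remaining token budget per packed row

    for i in order:
        for b, room in enumerate(free):
            if lengths[i] <= room:
                bins[b].append(i)
                free[b] -= lengths[i]
                break
        else:
            # No existing bin has room, so open a new one.
            bins.append([i])
            free.append(max_tokens - lengths[i])
    return bins


def padding_fraction(lengths: List[int], num_rows: int, row_len: int) -> float:
    """Fraction of token slots that hold padding rather than real tokens."""
    return 1.0 - sum(lengths) / (num_rows * row_len)


lengths = [7, 3, 5, 2, 6, 1]
packed = pack_sequences(lengths, max_tokens=8)
print(packed)                                      # [[0, 5], [4, 3], [2, 1]]
print(padding_fraction(lengths, len(lengths), 8))  # 0.5 (one sequence per row)
print(padding_fraction(lengths, len(packed), 8))   # 0.0 (packed rows)
```

In this toy case, padding each of the six sequences to 8 tokens leaves half of the slots empty, while packing fills three rows completely. A real training stack also has to keep attention from crossing sequence boundaries within a packed row, which this sketch does not cover.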