Nemotron 3 Ultra

Nemotron 3 Ultra is a hybrid Mamba-Transformer mixture-of-experts (MoE) model with 550B total parameters, of which 55B are active (A55B).

Megatron Bridge provides Nemotron 3 Ultra recipes and examples for Hugging Face to Megatron conversion, inference, DCLM pretraining, packed OpenMathInstruct-2 full SFT, and packed OpenMathInstruct-2 LoRA PEFT.

Use the main example README for setup and scripts: examples/models/nemotron/nemotron_3/ultra/README.md.