# MiniMax-M2

MiniMax-M2 is MiniMax's large Mixture-of-Experts (MoE) language model for text generation; its sparse expert routing activates only a fraction of the total parameters per token, which keeps inference efficient at scale.

| Task | Architecture | Parameters | HF Org |
|------|--------------|------------|--------|
| Text Generation (MoE) | `MiniMaxM2ForCausalLM` | varies | MiniMaxAI |

## Available Models

- MiniMax-M2.1
- MiniMax-M2.5

## Architecture

- `MiniMaxM2ForCausalLM`

## Example HF Models

| Model | HF ID |
|-------|-------|
| MiniMax M2.1 | `MiniMaxAI/MiniMax-M2.1` |
| MiniMax M2.5 | `MiniMaxAI/MiniMax-M2.5` |
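
Checkpoints of this size take a while to fetch, so you may want to download them ahead of time with the Hugging Face CLI (a sketch assuming `huggingface_hub` is installed; weights land in the local HF cache):

```bash
# Optional: pre-download a checkpoint into the local Hugging Face cache
# (assumes huggingface_hub is installed: pip install -U huggingface_hub).
huggingface-cli download MiniMaxAI/MiniMax-M2.1
```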

## Example Recipes

| Recipe | Description |
|--------|-------------|
| `minimax_m2.1_hellaswag_pp.yaml` | SFT of MiniMax-M2.1 on HellaSwag with pipeline parallelism |
| `minimax_m2.5_hellaswag_pp.yaml` | SFT of MiniMax-M2.5 on HellaSwag with pipeline parallelism |

## Try with NeMo AutoModel

1. Install NeMo AutoModel (see the Installation Guide for full instructions):

   ```bash
   pip install nemo-automodel
   ```
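
   A quick sanity check before moving on (the version query uses standard `importlib.metadata`; the `--help` flag on the `automodel` CLI is assumed, not confirmed here):

   ```bash
   # Confirm the package resolved and the CLI entry point is on PATH.
   python -c "from importlib.metadata import version; print(version('nemo-automodel'))"
   automodel --help  # assumed to print usage; most CLI entry points support --help
   ```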

2. Clone the repo to get the example recipes:

   ```bash
   git clone https://github.com/NVIDIA-NeMo/Automodel.git
   cd Automodel
   ```

> **Note:** This recipe was validated on 8 nodes × 8 GPUs (64 H100s). See the Launcher Guide for multi-node setup.

3. Run the recipe from inside the repo:

   ```bash
   automodel --nproc-per-node=8 examples/llm_finetune/minimax_m2/minimax_m2.1_hellaswag_pp.yaml
   ```
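
The MiniMax-M2.5 recipe launches the same way; only the YAML filename changes (the path below assumes it sits alongside the M2.1 recipe):

```bash
# Same launch pattern for MiniMax-M2.5; path assumed to mirror the M2.1 recipe's location.
automodel --nproc-per-node=8 examples/llm_finetune/minimax_m2/minimax_m2.5_hellaswag_pp.yaml
```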

### Run with Docker

1. Pull the container and mount a checkpoint directory:

   ```bash
   docker run --gpus all -it --rm \
     --shm-size=8g \
     -v $(pwd)/checkpoints:/opt/Automodel/checkpoints \
     nvcr.io/nvidia/nemo-automodel:26.02.00
   ```
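
   If checkpoints are already cached on the host, you can additionally mount the Hugging Face cache so the container reuses them instead of re-downloading (a sketch using only standard Docker flags; the paths assume the default host cache location and a root user inside the container):

   ```bash
   # Optional: reuse the host's Hugging Face cache inside the container
   # (assumes ~/.cache/huggingface on the host and a root-user container).
   docker run --gpus all -it --rm \
     --shm-size=8g \
     -v $(pwd)/checkpoints:/opt/Automodel/checkpoints \
     -v $HOME/.cache/huggingface:/root/.cache/huggingface \
     nvcr.io/nvidia/nemo-automodel:26.02.00
   ```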

2. Navigate to the AutoModel directory (where the recipes are):

   ```bash
   cd /opt/Automodel
   ```

3. Run the recipe:

   ```bash
   automodel --nproc-per-node=8 examples/llm_finetune/minimax_m2/minimax_m2.1_hellaswag_pp.yaml
   ```

See the Installation Guide and LLM Fine-Tuning Guide.

## Fine-Tuning

See the Large MoE Fine-Tuning Guide.

## Hugging Face Model Cards

- [MiniMaxAI/MiniMax-M2.1](https://huggingface.co/MiniMaxAI/MiniMax-M2.1)
- [MiniMaxAI/MiniMax-M2.5](https://huggingface.co/MiniMaxAI/MiniMax-M2.5)