Qwen3-Next
Qwen3-Next is a Mixture-of-Experts (MoE) language model from Alibaba Cloud’s Qwen team. It is designed for high-throughput inference: the total parameter count is large, but only a small subset of those parameters is activated for each token.
Available Models
- Qwen3-Next-80B-A3B: 80B total parameters, 3B activated per token
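To make the "80B total, 3B activated" figures concrete, the short sketch below computes the fraction of parameters used per token and a rough weight-memory estimate. The helper names are hypothetical, the parameter counts are the nominal values from the model name, and the 2 bytes per parameter figure assumes bf16 weights; optimizer state, activations, and KV cache are not included.

```python
# Back-of-the-envelope numbers for Qwen3-Next-80B-A3B.
# Assumptions (not from the model card): nominal parameter counts,
# bf16 storage at 2 bytes per parameter, weights only.

TOTAL_PARAMS_B = 80.0   # total parameters, in billions
ACTIVE_PARAMS_B = 3.0   # parameters activated per token, in billions
BYTES_PER_PARAM = 2     # bf16 (assumed)


def active_fraction(total_b: float, active_b: float) -> float:
    """Fraction of the model's parameters used for a single token."""
    return active_b / total_b


def weight_memory_gb(params_b: float, bytes_per_param: int) -> float:
    """Approximate weight memory in GB (1 GB = 1e9 bytes)."""
    return params_b * bytes_per_param


frac = active_fraction(TOTAL_PARAMS_B, ACTIVE_PARAMS_B)
weights_gb = weight_memory_gb(TOTAL_PARAMS_B, BYTES_PER_PARAM)

print(f"Active per token: {frac:.2%}")
print(f"Approx. bf16 weight memory: {weights_gb:.0f} GB")
```

Under these assumptions, all 80B parameters must still be held in memory even though only a small fraction participates in each forward pass, which is one reason a multi-GPU setup is typical for a model of this size.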
Architecture
Qwen3NextForCausalLM
Example HF Models
Example Recipes
Try with NeMo AutoModel
1. Install (full instructions):
2. Clone the repo to get the example recipes:
This recipe was validated on 4 nodes × 8 GPUs (32 H100s). See the Launcher Guide for multi-node setup.
3. Run the recipe from inside the repo:
Run with Docker
1. Pull the container and mount a checkpoint directory:
2. Navigate to the AutoModel directory (where the recipes are):
3. Run the recipe:
See the Installation Guide and LLM Fine-Tuning Guide.
Fine-Tuning
See the Large MoE Fine-Tuning Guide.