Qwen2 MoE
Qwen1.5-MoE is a Mixture-of-Experts variant from Alibaba Cloud that activates only a fraction of parameters per token, enabling efficient training and inference at scale.
Available Models
- Qwen1.5-MoE-A2.7B: 14.3B total parameters, 2.7B activated per token
Architecture
Qwen2MoeForCausalLM
Example HF Models
Example Recipes
Try with NeMo AutoModel
1. Install (full instructions):
2. Clone the repo to get the example recipes:
3. Run the recipe from inside the repo:
Run with Docker
1. Pull the container and mount a checkpoint directory:
2. Navigate to the AutoModel directory (where the recipes are):
3. Run the recipe:
See the Installation Guide and LLM Fine-Tuning Guide.
Fine-Tuning
See the LLM Fine-Tuning Guide and the Large MoE Fine-Tuning Guide.