Qwen2
Qwen2 is Alibaba Cloud’s second-generation large language model series. It features grouped-query attention (GQA), YaRN-based long-context extension, and Dual Chunk Attention (DCA) for long sequences. QwQ-32B-Preview, a reasoning-focused model, also uses this architecture.
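To make the grouped-query attention feature concrete, here is a minimal sketch of how query heads can share key/value heads. The head counts (8 query heads, 2 KV heads) are illustrative assumptions, not values read from any Qwen2 checkpoint, and the function names are hypothetical helpers written for this example. It assumes the common convention that consecutive query heads share one KV head.

```python
from typing import Dict, List


def kv_head_for_query_head(q_head: int, num_q_heads: int, num_kv_heads: int) -> int:
    """Return the KV head index that a given query head attends with.

    Assumes consecutive query heads are grouped onto the same KV head.
    """
    if num_q_heads % num_kv_heads != 0:
        raise ValueError("num_q_heads must be a multiple of num_kv_heads")
    if not 0 <= q_head < num_q_heads:
        raise ValueError("q_head out of range")
    group_size = num_q_heads // num_kv_heads
    return q_head // group_size


def gqa_groups(num_q_heads: int, num_kv_heads: int) -> Dict[int, List[int]]:
    """Map each KV head to the list of query heads that share it."""
    groups: Dict[int, List[int]] = {kv: [] for kv in range(num_kv_heads)}
    for q in range(num_q_heads):
        groups[kv_head_for_query_head(q, num_q_heads, num_kv_heads)].append(q)
    return groups


def kv_cache_fraction(num_q_heads: int, num_kv_heads: int) -> float:
    """KV-cache size relative to standard multi-head attention."""
    return num_kv_heads / num_q_heads


if __name__ == "__main__":
    # Illustrative head counts only (assumption, not a real model config).
    print(gqa_groups(8, 2))
    print(kv_cache_fraction(8, 2))
```

With fewer KV heads than query heads, the KV cache shrinks in proportion, which is the main reason GQA helps with long-sequence inference.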
Available Models
- Qwen2.5: 0.5B, 1.5B, 3B, 7B, 14B, 32B, 72B
- Qwen2: 0.5B, 1.5B, 7B, 57B-A14B (MoE), 72B
- QwQ-32B-Preview — reasoning model
Architecture
Qwen2ForCausalLM
Example HF Models
Example Recipes
Try with NeMo AutoModel
1. Install (full instructions):
2. Clone the repo to get the example recipes:
3. Run the recipe from inside the repo:
Run with Docker
1. Pull the container and mount a checkpoint directory:
2. Navigate to the AutoModel directory (where the recipes are):
3. Run the recipe:
See the Installation Guide and LLM Fine-Tuning Guide.
Fine-Tuning
See the LLM Fine-Tuning Guide for full SFT and LoRA instructions.