Qwen3
Qwen3 is Alibaba Cloud’s third-generation dense language model series, featuring improved reasoning, instruction following, and multilingual capabilities over Qwen2.
Available Models
- Qwen3: 0.6B, 1.7B, 4B, 8B, 14B, 32B
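The size labels above can be turned into checkpoint identifiers programmatically. The sketch below is a minimal helper, assuming the checkpoints are published on the Hugging Face Hub under the pattern `Qwen/Qwen3-<size>`. The names `QWEN3_DENSE_SIZES`, `hub_id`, and `size_in_billions` are hypothetical and are not part of NeMo AutoModel.

```python
# Sizes exactly as listed under "Available Models".
QWEN3_DENSE_SIZES = ["0.6B", "1.7B", "4B", "8B", "14B", "32B"]


def hub_id(size: str, org: str = "Qwen") -> str:
    """Build a Hugging Face Hub model ID for one of the listed sizes.

    Assumption: checkpoints follow the "<org>/Qwen3-<size>" naming pattern.
    """
    if size not in QWEN3_DENSE_SIZES:
        raise ValueError(f"unknown Qwen3 dense size: {size!r}")
    return f"{org}/Qwen3-{size}"


def size_in_billions(size: str) -> float:
    """Convert a label such as "1.7B" to a float (1.7) for sorting or filtering."""
    return float(size.rstrip("B"))


if __name__ == "__main__":
    # Print one candidate model ID per listed size.
    for label in QWEN3_DENSE_SIZES:
        print(hub_id(label))
```

Rejecting unlisted sizes up front keeps a typo such as `"7B"` from silently producing an ID that does not correspond to a listed model.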
Architecture
Qwen3ForCausalLM
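`Qwen3ForCausalLM` is also the class name used by the Hugging Face `transformers` library for this architecture. As a rough illustration, the sketch below loads a checkpoint through the generic `AutoModelForCausalLM` entry point and reports which concrete class it resolved to. It assumes `transformers` 4.51 or later and PyTorch are installed, and that `Qwen/Qwen3-0.6B` is the Hub ID of the smallest model; `load_qwen3` and `architecture_name` are hypothetical helper names.

```python
# Assumed Hub ID for the smallest listed size; swap in another size as needed.
MODEL_ID = "Qwen/Qwen3-0.6B"


def load_qwen3(model_id: str = MODEL_ID):
    """Load the tokenizer and model for a Qwen3 checkpoint.

    The transformers import is deferred so that importing this module stays
    cheap; calling the function downloads weights on first use.
    """
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(model_id)
    return tokenizer, model


def architecture_name(model) -> str:
    """Return the name of the concrete class that the Auto class resolved to."""
    return type(model).__name__
```

Calling `architecture_name(load_qwen3()[1])` is expected to report `Qwen3ForCausalLM` for these checkpoints, which is a quick way to confirm that a checkpoint matches the architecture named here.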
Example HF Models
Example Recipes
Try with NeMo AutoModel
1. Install (full instructions):
2. Clone the repo to get the example recipes:
3. Run the recipe from inside the repo:
Run with Docker
1. Pull the container and mount a checkpoint directory:
2. Navigate to the AutoModel directory (where the recipes are):
3. Run the recipe:
For more detail, see the Installation Guide and the LLM Fine-Tuning Guide.
Fine-Tuning
See the LLM Fine-Tuning Guide for full SFT and LoRA instructions.
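To give a feel for why LoRA is often chosen over full-parameter fine-tuning, the sketch below compares trainable parameter counts for a single linear layer. LoRA freezes the original `d_out x d_in` weight and trains two low-rank factors, `A` (`rank x d_in`) and `B` (`d_out x rank`). The layer dimensions and rank are illustrative assumptions, not Qwen3's actual configuration, and the function names are hypothetical.

```python
def full_trainable_params(d_in: int, d_out: int) -> int:
    """Weights updated when the whole d_out x d_in matrix is trained."""
    return d_in * d_out


def lora_trainable_params(d_in: int, d_out: int, rank: int) -> int:
    """Weights in the LoRA factors A (rank x d_in) and B (d_out x rank)."""
    return rank * (d_in + d_out)


if __name__ == "__main__":
    # Illustrative sizes only; real layer shapes depend on the chosen model.
    d_in, d_out, rank = 1024, 1024, 8
    full = full_trainable_params(d_in, d_out)
    lora = lora_trainable_params(d_in, d_out, rank)
    print(f"full: {full}, LoRA: {lora}, ratio: {lora / full}")
    # prints "full: 1048576, LoRA: 16384, ratio: 0.015625"
```

With these example numbers LoRA trains about 1.6% of the weights in the layer, which is the main source of its memory savings during fine-tuning.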