Qwen3.5-VL
Qwen3.5-VL is Alibaba Cloud’s next-generation vision-language model series, with dense and MoE variants for image and multimodal understanding tasks.
Available Models
- Qwen3.5-VL-4B: 4B dense model
- Qwen3.5-VL-9B: 9B dense model
- Qwen3.5-MoE: large MoE variant (35B+)
- Qwen3.6-27B: 27B dense model
- Qwen3.6-35B-A3B: next-generation MoE variant (35B total, 3B active)
Architectures
- Qwen3_5VLForConditionalGeneration: dense models
- Qwen3_5MoeVLForConditionalGeneration: MoE variant
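As a quick illustration of how the model list relates to the two architecture classes, here is a minimal Python sketch of a lookup table. It rests on an assumption this page does not state explicitly: that every model listed as dense uses the dense class and every model listed as MoE uses the MoE class, including the Qwen3.6 entries. The names `MODEL_KINDS` and `architecture_for` are hypothetical helpers for illustration, not part of NeMo AutoModel.

```python
# Architecture class names exactly as listed on this page.
DENSE_ARCH = "Qwen3_5VLForConditionalGeneration"
MOE_ARCH = "Qwen3_5MoeVLForConditionalGeneration"

# Dense/MoE labels copied from the "Available Models" list.
# ASSUMPTION: each label maps to the matching architecture class.
MODEL_KINDS = {
    "Qwen3.5-VL-4B": "dense",
    "Qwen3.5-VL-9B": "dense",
    "Qwen3.5-MoE": "moe",
    "Qwen3.6-27B": "dense",
    "Qwen3.6-35B-A3B": "moe",
}


def architecture_for(model_name: str) -> str:
    """Return the architecture class name assumed for a listed model."""
    try:
        kind = MODEL_KINDS[model_name]
    except KeyError:
        raise ValueError(f"Model not listed on this page: {model_name!r}") from None
    return MOE_ARCH if kind == "moe" else DENSE_ARCH


if __name__ == "__main__":
    for name in MODEL_KINDS:
        print(f"{name}: {architecture_for(name)}")
```

To confirm the mapping for a real checkpoint, check the architecture name recorded in that checkpoint's own configuration rather than relying on this table.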
Example Recipes
Try with NeMo AutoModel
1. Install (full instructions):
2. Clone the repo to get the example recipes:
3. Run the recipe from inside the repo:
Run with Docker
1. Pull the container and mount a checkpoint directory:
2. Navigate to the AutoModel directory (where the recipes are):
3. Run the recipe:
See the Installation Guide and VLM Fine-Tuning Guide.
Fine-Tuning
See the VLM Fine-Tuning Guide.