Qwen Models

This page provides detailed technical specifications for the Qwen model family supported by NeMo Customizer. For information about supported features and capabilities, refer to Tested Models.

Qwen2.5-1.5B-Instruct

Property	Value
Creator	Alibaba Cloud
Architecture	transformer
Description	Qwen2.5-1.5B-Instruct is a compact, instruction-tuned model from the Qwen2.5 series designed for efficient customization and deployment.
Max I/O Tokens	4096
Parameters	1.5 billion
Training Data	Not specified
Default Name	Qwen/Qwen2.5-1.5B-Instruct
Hugging Face	Qwen/Qwen2.5-1.5B-Instruct
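
As a rough illustration of how the Max I/O Tokens value might be used when preparing requests or training samples, the following sketch checks a token budget. It assumes the 4096 limit covers prompt and completion tokens combined, which this page does not state explicitly. The function names are hypothetical and are not part of NeMo Customizer.

```python
MAX_IO_TOKENS = 4096  # "Max I/O Tokens" from the table above


def fits_token_budget(prompt_tokens: int, max_new_tokens: int,
                      limit: int = MAX_IO_TOKENS) -> bool:
    """Return True if prompt plus requested completion stays within the limit.

    Assumption: the limit applies to input and output tokens combined.
    """
    if prompt_tokens < 0 or max_new_tokens < 0:
        raise ValueError("token counts must be non-negative")
    return prompt_tokens + max_new_tokens <= limit


def max_completion_tokens(prompt_tokens: int,
                          limit: int = MAX_IO_TOKENS) -> int:
    """Largest completion length that still fits after the prompt."""
    if prompt_tokens < 0:
        raise ValueError("token counts must be non-negative")
    return max(0, limit - prompt_tokens)
```

Token counts would come from the model's own tokenizer; the sketch takes them as plain integers so that it stays independent of any particular library.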

Training Options

LoRA: 1x 80GB GPU, tensor parallel size 1
Full SFT: 1x 80GB GPU, tensor parallel size 1

Deployment Configuration

LoRA:
  NIM Image: nvcr.io/nim/nvidia/llm-nim:1.15.5
  GPU Count: 1x 80GB
Full SFT:
  NIM Image: nvcr.io/nim/nvidia/llm-nim:1.15.5
  GPU Count: 1x 80GB
  Additional Environment Variables:
    NIM_MODEL_PROFILE: vllm
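
The deployment settings above can be captured as data. The sketch below is a minimal, hypothetical illustration and not the NeMo deployment API: it records the image, GPU count, and environment variables for each customization type and assembles a plain `docker run` argument list from them. The `NimDeployment` class and `docker_run_args` helper are invented for this example. The environment variable is read here as applying to the Full SFT deployment, after which it is listed. A real NIM launch needs further settings (credentials, model location, ports) that this page does not specify.

```python
import shlex
from dataclasses import dataclass, field


@dataclass(frozen=True)
class NimDeployment:
    """Hypothetical container for one deployment configuration."""
    image: str
    gpu_count: int
    env: dict = field(default_factory=dict)


# Values copied from the Deployment Configuration section.
DEPLOYMENTS = {
    "lora": NimDeployment("nvcr.io/nim/nvidia/llm-nim:1.15.5", 1),
    "full_sft": NimDeployment(
        "nvcr.io/nim/nvidia/llm-nim:1.15.5",
        1,
        # Assumption: this variable belongs to the Full SFT deployment only.
        {"NIM_MODEL_PROFILE": "vllm"},
    ),
}


def docker_run_args(dep: NimDeployment) -> list:
    """Build a `docker run` argument list from a deployment configuration."""
    args = ["docker", "run", "--rm", "--gpus", str(dep.gpu_count)]
    for key, value in sorted(dep.env.items()):
        args += ["-e", f"{key}={value}"]
    args.append(dep.image)
    return args


if __name__ == "__main__":
    for name, dep in DEPLOYMENTS.items():
        print(name, "->", shlex.join(docker_run_args(dep)))
```

Keeping the settings in one table makes it easy to see that the two customization types differ only in the extra environment variable.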

Qwen3-0.6B

Property	Value
Creator	Alibaba Cloud
Architecture	transformer
Description	Qwen3-0.6B is a lightweight model from the Qwen3 series, suitable for resource-constrained environments and rapid experimentation.
Max I/O Tokens	4096
Parameters	0.6 billion
Training Data	Not specified
Default Name	Qwen/Qwen3-0.6B
Hugging Face	Qwen/Qwen3-0.6B

Training Options

LoRA: 1x 80GB GPU, tensor parallel size 1
Full SFT: 1x 80GB GPU, tensor parallel size 1

Deployment Configuration

LoRA:
  NIM Image: nvcr.io/nim/nvidia/llm-nim:1.15.5
  GPU Count: 1x 80GB
Full SFT:
  NIM Image: nvcr.io/nim/nvidia/llm-nim:1.15.5
  GPU Count: 1x 80GB
  Additional Environment Variables:
    NIM_MODEL_PROFILE: vllm