# Llama Models
This page provides detailed technical specifications for the Llama model family supported by NeMo Customizer. For information about supported features and capabilities, refer to Tested Models.
## Llama-3.2-3B Instruct

| Property | Value |
|---|---|
| Creator | Meta |
| Architecture | transformer |
| Description | Llama-3.2-3B is a compact yet powerful language model suitable for various dialogue applications. |
| Max I/O Tokens | 8192 |
| Parameters | 3 billion |
| Training Data | 15+ trillion tokens (up to 2024) |
| Default Name | meta-llama/Llama-3.2-3B-Instruct |
| HuggingFace | |
### Training Options

- LoRA: 1x 80GB GPU, tensor parallel size 1
- Full SFT: 4x 80GB GPUs, tensor parallel size 2
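The GPU count and tensor parallel size together determine how many data-parallel replicas a training job runs. A minimal sketch of that arithmetic, assuming no pipeline parallelism (the function name and structure are illustrative, not a NeMo Customizer API):

```python
def data_parallel_size(num_gpus: int, tensor_parallel_size: int) -> int:
    """Data-parallel replicas = total GPUs / tensor-parallel group size,
    assuming no pipeline parallelism."""
    if num_gpus % tensor_parallel_size != 0:
        raise ValueError("GPU count must be divisible by tensor parallel size")
    return num_gpus // tensor_parallel_size

# Values from the training options above for Llama-3.2-3B:
print(data_parallel_size(1, 1))  # LoRA: 1 replica
print(data_parallel_size(4, 2))  # Full SFT: 2 replicas
```

This is why the GPU counts in these tables are always whole multiples of the tensor parallel size.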
### Deployment Configuration

LoRA:

- NIM Image: `nvcr.io/nim/nvidia/llm-nim:1.15.5`
- GPU Count: 1x 80GB

Full SFT:

- NIM Image: `nvcr.io/nim/nvidia/llm-nim:1.15.5`
- GPU Count: 1x 80GB
- Additional Environment Variables: `NIM_MODEL_PROFILE: vllm`
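In a Kubernetes deployment, the image, GPU count, and additional environment variable above would typically be wired into the container spec. A minimal sketch using generic Kubernetes fields (the container name is hypothetical; only the image tag, env var, and GPU count come from this page):

```yaml
# Sketch only: standard Kubernetes container spec fields,
# populated with the Full SFT deployment values documented above.
containers:
  - name: llm-nim                # hypothetical container name
    image: nvcr.io/nim/nvidia/llm-nim:1.15.5
    env:
      - name: NIM_MODEL_PROFILE
        value: "vllm"
    resources:
      limits:
        nvidia.com/gpu: 1        # 1x 80GB GPU
```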
## Llama-3.2-1B Instruct

| Property | Value |
|---|---|
| Creator | Meta |
| Architecture | transformer |
| Description | Llama-3.2-1B is a lightweight language model designed for efficient deployment while maintaining strong capabilities. |
| Max I/O Tokens | 8192 |
| Parameters | 1 billion |
| Training Data | 15+ trillion tokens (up to 2024) |
| Default Name | meta-llama/Llama-3.2-1B-Instruct |
| HuggingFace | |
### Training Options

- LoRA: 1x 80GB GPU, tensor parallel size 1
- Full SFT: 1x 80GB GPU, tensor parallel size 1
### Deployment Configuration

LoRA:

- NIM Image: `nvcr.io/nim/nvidia/llm-nim:1.15.5`
- GPU Count: 1x 80GB

Full SFT:

- NIM Image: `nvcr.io/nim/nvidia/llm-nim:1.15.5`
- GPU Count: 1x 80GB
- Additional Environment Variables: `NIM_MODEL_PROFILE: vllm`
## Llama-3.1-8B Instruct

| Property | Value |
|---|---|
| Creator | Meta |
| Architecture | transformer |
| Description | Llama-3.1-8B is a large language model optimized for multilingual dialogue use cases. |
| Max I/O Tokens | 8192 |
| Parameters | 8 billion |
| Training Data | 15 trillion tokens (up to December 2023) |
| Default Name | meta-llama/Llama-3.1-8B-Instruct |
| HuggingFace | |
### Training Options

- LoRA: 1x 80GB GPU, tensor parallel size 1
- Full SFT: 8x 80GB GPUs, tensor parallel size 4
### Deployment Configuration

LoRA:

- NIM Image: `nvcr.io/nim/nvidia/llm-nim:1.15.5`
- GPU Count: 1x 80GB

Full SFT:

- NIM Image: `nvcr.io/nim/nvidia/llm-nim:1.15.5`
- GPU Count: 8x 80GB
- Additional Environment Variables: `NIM_MODEL_PROFILE: vllm`
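When sizing a cluster, the per-model training requirements above can be consolidated into a single lookup. A minimal sketch (the dictionary structure and function name are ours, not a NeMo Customizer API; the numbers come from the tables on this page):

```python
# Training resource requirements documented on this page,
# keyed by the model's default name.
TRAINING_REQUIREMENTS = {
    "meta-llama/Llama-3.2-3B-Instruct": {
        "lora":     {"gpus": 1, "tensor_parallel_size": 1},
        "full_sft": {"gpus": 4, "tensor_parallel_size": 2},
    },
    "meta-llama/Llama-3.2-1B-Instruct": {
        "lora":     {"gpus": 1, "tensor_parallel_size": 1},
        "full_sft": {"gpus": 1, "tensor_parallel_size": 1},
    },
    "meta-llama/Llama-3.1-8B-Instruct": {
        "lora":     {"gpus": 1, "tensor_parallel_size": 1},
        "full_sft": {"gpus": 8, "tensor_parallel_size": 4},
    },
}

def gpus_needed(model: str, method: str) -> int:
    """Number of 80GB GPUs required to fine-tune `model` with `method`
    ("lora" or "full_sft")."""
    return TRAINING_REQUIREMENTS[model][method]["gpus"]

print(gpus_needed("meta-llama/Llama-3.1-8B-Instruct", "full_sft"))  # 8
```

Note that LoRA fits on a single 80GB GPU for every model listed here; only full SFT scales GPU count with model size.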