Llama Models
This page provides detailed technical specifications for the Llama model family supported by NeMo Customizer. For information about supported features and capabilities, refer to Tested Models.
Llama-3.2-3B Instruct
Training Options
- LoRA: 1x 80GB GPU, tensor parallel size 1
- Full SFT: 4x 80GB GPU, tensor parallel size 2
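The relationship between GPU count and tensor parallel size in the options above can be sketched with a little arithmetic. This is a minimal illustration, not part of NeMo Customizer: the helper name `data_parallel_size` is hypothetical, and it assumes a pipeline parallel size of 1 and that any GPUs not consumed by tensor parallelism hold data-parallel replicas, neither of which this page states.

```python
def data_parallel_size(num_gpus: int, tensor_parallel_size: int,
                       pipeline_parallel_size: int = 1) -> int:
    """Return how many model replicas fit on the given GPUs.

    Hypothetical helper: assumes GPUs are split only between tensor,
    pipeline, and data parallelism.
    """
    gpus_per_replica = tensor_parallel_size * pipeline_parallel_size
    if num_gpus % gpus_per_replica != 0:
        raise ValueError("GPU count must be a multiple of the model-parallel size")
    return num_gpus // gpus_per_replica


# Values listed above for Llama-3.2-3B Instruct: (GPU count, tensor parallel size)
training_options = {"LoRA": (1, 1), "Full SFT": (4, 2)}

for method, (gpus, tp) in training_options.items():
    print(f"{method}: {data_parallel_size(gpus, tp)} data-parallel replica(s)")
```

Under these assumptions, the Full SFT option shards each copy of the model across 2 GPUs and trains 2 such copies in parallel, while the LoRA option runs a single unsharded copy.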
Deployment Configuration
- LoRA:
  - NIM Image: nvcr.io/nim/nvidia/llm-nim:1.15.5
  - GPU Count: 1x 80GB
- Full SFT:
  - NIM Image: nvcr.io/nim/nvidia/llm-nim:1.15.5
  - GPU Count: 1x 80GB
  - Additional Environment Variables: NIM_MODEL_PROFILE:vllm
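For readers who want to see how the image, GPU count, and environment variable fit together, the following sketch assembles a `docker run` command line from the Full SFT values listed above. It is an illustration under assumptions: the helper `nim_docker_command` is hypothetical, running the container directly with Docker is only one possible way to use the image, and a working deployment needs further settings (credentials, model weights, ports) that this page does not cover. The command is built but not executed.

```python
import shlex


def nim_docker_command(image: str, gpu_count: int, env: dict[str, str]) -> list[str]:
    """Build (but do not run) a `docker run` argument list for a NIM container.

    Hypothetical helper for illustration; it is not part of NeMo Customizer or NIM.
    """
    argv = ["docker", "run", "--rm", "--gpus", str(gpu_count)]
    # Pass each additional environment variable with -e NAME=VALUE.
    for name, value in sorted(env.items()):
        argv += ["-e", f"{name}={value}"]
    argv.append(image)
    return argv


# Full SFT deployment values from this page for Llama-3.2-3B Instruct.
cmd = nim_docker_command(
    image="nvcr.io/nim/nvidia/llm-nim:1.15.5",
    gpu_count=1,
    env={"NIM_MODEL_PROFILE": "vllm"},
)
print(shlex.join(cmd))
```

Keeping the command as a list of arguments rather than a single string avoids shell-quoting mistakes if it is later passed to `subprocess.run`.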
Llama-3.2-1B Instruct
Training Options
- LoRA: 1x 80GB GPU, tensor parallel size 1
- Full SFT: 1x 80GB GPU, tensor parallel size 1
Deployment Configuration
- LoRA:
  - NIM Image: nvcr.io/nim/nvidia/llm-nim:1.15.5
  - GPU Count: 1x 80GB
- Full SFT:
  - NIM Image: nvcr.io/nim/nvidia/llm-nim:1.15.5
  - GPU Count: 1x 80GB
  - Additional Environment Variables: NIM_MODEL_PROFILE:vllm
Llama-3.1-8B Instruct
Training Options
- LoRA: 1x 80GB GPU, tensor parallel size 1
- Full SFT: 8x 80GB GPU, tensor parallel size 4
Deployment Configuration
- LoRA:
  - NIM Image: nvcr.io/nim/nvidia/llm-nim:1.15.5
  - GPU Count: 1x 80GB
- Full SFT:
  - NIM Image: nvcr.io/nim/nvidia/llm-nim:1.15.5
  - GPU Count: 8x 80GB
  - Additional Environment Variables: NIM_MODEL_PROFILE:vllm