Llama Models

This page provides detailed technical specifications for the Llama model family supported by NeMo Customizer. For information about supported features and capabilities, refer to Tested Models.

Llama-3.2-3B Instruct

| Property | Value |
| --- | --- |
| Creator | Meta |
| Architecture | transformer |
| Description | Llama-3.2-3B is a compact yet powerful language model suitable for various dialogue applications. |
| Max I/O Tokens | 8192 |
| Parameters | 3 billion |
| Training Data | 15+ trillion tokens (up to 2024) |
| Default Name | meta-llama/Llama-3.2-3B-Instruct |
| HuggingFace | meta-llama/Llama-3.2-3B-Instruct |

Training Options

  • LoRA: 1x 80GB GPU, tensor parallel size 1
  • Full SFT: 4x 80GB GPU, tensor parallel size 2
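
The relationship between GPU count and tensor parallel size in the options above can be sketched as follows. This is a minimal illustration, assuming that GPUs not consumed by tensor parallelism act as data-parallel replicas and that no pipeline parallelism is used. The function name `data_parallel_size` is hypothetical and is not part of NeMo Customizer.

```python
def data_parallel_size(num_gpus: int, tensor_parallel_size: int) -> int:
    """Return the number of data-parallel replicas.

    Assumption: GPU count = tensor parallel size x data parallel size,
    with no pipeline parallelism.
    """
    if num_gpus <= 0 or tensor_parallel_size <= 0:
        raise ValueError("GPU count and tensor parallel size must be positive")
    if num_gpus % tensor_parallel_size != 0:
        raise ValueError("GPU count must be divisible by tensor parallel size")
    return num_gpus // tensor_parallel_size


# Values taken from the Training Options for Llama-3.2-3B Instruct.
lora_replicas = data_parallel_size(num_gpus=1, tensor_parallel_size=1)
full_sft_replicas = data_parallel_size(num_gpus=4, tensor_parallel_size=2)
print(lora_replicas, full_sft_replicas)
```

Under this assumption, the Full SFT option splits each model copy across 2 GPUs and trains 2 copies in parallel, while LoRA runs a single copy on a single GPU.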

Deployment Configuration

  • LoRA:
      • NIM Image: nvcr.io/nim/nvidia/llm-nim:1.15.5
      • GPU Count: 1x 80GB
  • Full SFT:
      • NIM Image: nvcr.io/nim/nvidia/llm-nim:1.15.5
      • GPU Count: 1x 80GB
      • Additional Environment Variables:
          • NIM_MODEL_PROFILE: vllm
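
The deployment values above can be captured in a small data structure, for example to feed a deployment script. The sketch below is illustrative only: the `DeploymentSettings` class and the `env_assignments` helper are hypothetical names, not a NeMo Customizer or NIM API, and it assumes the additional environment variable applies to the Full SFT deployment only.

```python
from dataclasses import dataclass, field


@dataclass
class DeploymentSettings:
    """Hypothetical container for the deployment values listed above."""
    nim_image: str
    gpu_count: int
    env: dict = field(default_factory=dict)


# Values copied from the Deployment Configuration for Llama-3.2-3B Instruct.
NIM_IMAGE = "nvcr.io/nim/nvidia/llm-nim:1.15.5"
DEPLOYMENTS = {
    "lora": DeploymentSettings(nim_image=NIM_IMAGE, gpu_count=1),
    # Assumption: NIM_MODEL_PROFILE is set for Full SFT deployments only.
    "full_sft": DeploymentSettings(
        nim_image=NIM_IMAGE,
        gpu_count=1,
        env={"NIM_MODEL_PROFILE": "vllm"},
    ),
}


def env_assignments(settings: DeploymentSettings) -> list:
    """Render environment variables as sorted KEY=VALUE strings."""
    return [f"{key}={value}" for key, value in sorted(settings.env.items())]


print(env_assignments(DEPLOYMENTS["lora"]))
print(env_assignments(DEPLOYMENTS["full_sft"]))
```

Keeping the image tag in one constant makes it easier to update both deployment types together when the NIM image version changes.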

Llama-3.2-1B Instruct

| Property | Value |
| --- | --- |
| Creator | Meta |
| Architecture | transformer |
| Description | Llama-3.2-1B is a lightweight language model designed for efficient deployment while maintaining strong capabilities. |
| Max I/O Tokens | 8192 |
| Parameters | 1 billion |
| Training Data | 15+ trillion tokens (up to 2024) |
| Default Name | meta-llama/Llama-3.2-1B-Instruct |
| HuggingFace | meta-llama/Llama-3.2-1B-Instruct |

Training Options

  • LoRA: 1x 80GB GPU, tensor parallel size 1
  • Full SFT: 1x 80GB GPU, tensor parallel size 1
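
A rough back-of-the-envelope estimate helps explain why Full SFT for a 1-billion-parameter model can fit on a single 80GB GPU. The sketch below uses a common rule of thumb of about 16 bytes per parameter for mixed-precision training with the Adam optimizer. Both the rule of thumb and the function name are assumptions for illustration and do not describe how NeMo Customizer sizes jobs. Activation memory is excluded, so the result is a lower bound.

```python
# Assumed bytes per parameter for mixed-precision training with Adam:
# 2 (16-bit weights) + 2 (16-bit gradients) + 12 (32-bit master weights
# plus two Adam moment buffers). Activations are not included.
BYTES_PER_PARAM_FULL_SFT = 16
GPU_MEMORY_GB = 80


def full_sft_state_gb(num_params: float) -> float:
    """Estimate weight, gradient, and optimizer state size in GB."""
    return num_params * BYTES_PER_PARAM_FULL_SFT / 1e9


# Llama-3.2-1B Instruct: 1 billion parameters.
state_gb = full_sft_state_gb(1e9)
fits_on_one_gpu = state_gb <= GPU_MEMORY_GB
print(state_gb, fits_on_one_gpu)
```

The same estimate for an 8-billion-parameter model gives roughly 128 GB of training state, which exceeds one 80GB GPU before activations are counted.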

Deployment Configuration

  • LoRA:
      • NIM Image: nvcr.io/nim/nvidia/llm-nim:1.15.5
      • GPU Count: 1x 80GB
  • Full SFT:
      • NIM Image: nvcr.io/nim/nvidia/llm-nim:1.15.5
      • GPU Count: 1x 80GB
      • Additional Environment Variables:
          • NIM_MODEL_PROFILE: vllm

Llama-3.1-8B Instruct

| Property | Value |
| --- | --- |
| Creator | Meta |
| Architecture | transformer |
| Description | Llama-3.1-8B is a large language model optimized for multilingual dialogue use cases. |
| Max I/O Tokens | 8192 |
| Parameters | 8 billion |
| Training Data | 15 trillion tokens (up to December 2023) |
| Default Name | meta-llama/Llama-3.1-8B-Instruct |
| HuggingFace | meta-llama/Llama-3.1-8B-Instruct |

Training Options

  • LoRA: 1x 80GB GPU, tensor parallel size 1
  • Full SFT: 8x 80GB GPU, tensor parallel size 4

Deployment Configuration

  • LoRA:
      • NIM Image: nvcr.io/nim/nvidia/llm-nim:1.15.5
      • GPU Count: 1x 80GB
  • Full SFT:
      • NIM Image: nvcr.io/nim/nvidia/llm-nim:1.15.5
      • GPU Count: 8x 80GB
      • Additional Environment Variables:
          • NIM_MODEL_PROFILE: vllm
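
For capacity planning, the training and deployment GPU counts for all three models on this page can be collected into one lookup table. The sketch below is a convenience illustration only: the `MODELS` table layout and the `gpu_plan` and `max_gpus_needed` helpers are hypothetical and not part of any NeMo Customizer API. The numbers are copied from the sections above.

```python
# (training GPUs, tensor parallel size, deployment GPUs) per model and method.
# All GPUs are 80GB, as listed in the sections above.
MODELS = {
    "meta-llama/Llama-3.2-3B-Instruct": {"lora": (1, 1, 1), "full_sft": (4, 2, 1)},
    "meta-llama/Llama-3.2-1B-Instruct": {"lora": (1, 1, 1), "full_sft": (1, 1, 1)},
    "meta-llama/Llama-3.1-8B-Instruct": {"lora": (1, 1, 1), "full_sft": (8, 4, 8)},
}


def gpu_plan(model: str, method: str) -> dict:
    """Return the GPU requirements for one model and training method."""
    train_gpus, tensor_parallel_size, deploy_gpus = MODELS[model][method]
    return {
        "train_gpus": train_gpus,
        "tensor_parallel_size": tensor_parallel_size,
        "deploy_gpus": deploy_gpus,
    }


def max_gpus_needed(model: str, method: str) -> int:
    """Return the larger of the training and deployment GPU counts."""
    plan = gpu_plan(model, method)
    return max(plan["train_gpus"], plan["deploy_gpus"])


plan_8b = gpu_plan("meta-llama/Llama-3.1-8B-Instruct", "full_sft")
print(plan_8b)
print(max_gpus_needed("meta-llama/Llama-3.2-3B-Instruct", "full_sft"))
```

A lookup such as this makes it easy to see that Full SFT of Llama-3.1-8B Instruct is the only configuration on this page that needs more than one GPU at deployment time.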