# Llama Models

This page provides detailed technical specifications for the Llama model family supported by NeMo Customizer. For information about supported features and capabilities, refer to Tested Models.

## Llama-3.2-3B Instruct

| Property | Value |
| --- | --- |
| Creator | Meta |
| Architecture | transformer |
| Description | Llama-3.2-3B is a compact yet powerful language model suitable for various dialogue applications. |
| Max I/O Tokens | 8192 |
| Parameters | 3 billion |
| Training Data | 15+ trillion tokens (up to 2024) |
| Default Name | `meta-llama/Llama-3.2-3B-Instruct` |
| HuggingFace | `meta-llama/Llama-3.2-3B-Instruct` |

### Training Options

- LoRA: 1x 80GB GPU, tensor parallel size 1
- Full SFT: 4x 80GB GPUs, tensor parallel size 2
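The GPU count and tensor parallel size above imply a data-parallel dimension as well. A minimal sketch of that relationship, assuming the remaining GPUs form data-parallel replicas (i.e., data parallelism = GPUs ÷ tensor parallel size); the helper name is illustrative, not a NeMo Customizer API:

```python
def data_parallel_size(num_gpus: int, tensor_parallel: int) -> int:
    """Degree of data parallelism implied by a GPU count and tensor-parallel size."""
    if num_gpus % tensor_parallel != 0:
        raise ValueError("GPU count must be divisible by tensor-parallel size")
    return num_gpus // tensor_parallel

# Llama-3.2-3B options from this page:
assert data_parallel_size(1, 1) == 1  # LoRA: 1 GPU, TP 1
assert data_parallel_size(4, 2) == 2  # Full SFT: 4 GPUs, TP 2 -> 2 data-parallel replicas
```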

### Deployment Configuration

- LoRA:
  - NIM Image: `nvcr.io/nim/nvidia/llm-nim:1.15.5`
  - GPU Count: 1x 80GB
- Full SFT:
  - NIM Image: `nvcr.io/nim/nvidia/llm-nim:1.15.5`
  - GPU Count: 1x 80GB
  - Additional Environment Variables:
    - `NIM_MODEL_PROFILE`: `vllm`
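The Full SFT deployment settings above can be collected into a single configuration mapping before being passed to whatever deployment tooling you use. This is an illustrative sketch only; the key names are assumptions, not a NeMo Customizer schema:

```python
# Hypothetical config mapping for the Full SFT deployment of Llama-3.2-3B;
# values are taken from this page, key names are illustrative.
full_sft_deployment = {
    "nim_image": "nvcr.io/nim/nvidia/llm-nim:1.15.5",
    "gpu_count": 1,
    "gpu_memory_gb": 80,
    "env": {"NIM_MODEL_PROFILE": "vllm"},
}
```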

## Llama-3.2-1B Instruct

| Property | Value |
| --- | --- |
| Creator | Meta |
| Architecture | transformer |
| Description | Llama-3.2-1B is a lightweight language model designed for efficient deployment while maintaining strong capabilities. |
| Max I/O Tokens | 8192 |
| Parameters | 1 billion |
| Training Data | 15+ trillion tokens (up to 2024) |
| Default Name | `meta-llama/Llama-3.2-1B-Instruct` |
| HuggingFace | `meta-llama/Llama-3.2-1B-Instruct` |

### Training Options

- LoRA: 1x 80GB GPU, tensor parallel size 1
- Full SFT: 1x 80GB GPU, tensor parallel size 1

### Deployment Configuration

- LoRA:
  - NIM Image: `nvcr.io/nim/nvidia/llm-nim:1.15.5`
  - GPU Count: 1x 80GB
- Full SFT:
  - NIM Image: `nvcr.io/nim/nvidia/llm-nim:1.15.5`
  - GPU Count: 1x 80GB
  - Additional Environment Variables:
    - `NIM_MODEL_PROFILE`: `vllm`

## Llama-3.1-8B Instruct

| Property | Value |
| --- | --- |
| Creator | Meta |
| Architecture | transformer |
| Description | Llama-3.1-8B is a large language model optimized for multilingual dialogue use cases. |
| Max I/O Tokens | 8192 |
| Parameters | 8 billion |
| Training Data | 15 trillion tokens (up to December 2023) |
| Default Name | `meta-llama/Llama-3.1-8B-Instruct` |
| HuggingFace | `meta-llama/Llama-3.1-8B-Instruct` |

### Training Options

- LoRA: 1x 80GB GPU, tensor parallel size 1
- Full SFT: 8x 80GB GPUs, tensor parallel size 4

### Deployment Configuration

- LoRA:
  - NIM Image: `nvcr.io/nim/nvidia/llm-nim:1.15.5`
  - GPU Count: 1x 80GB
- Full SFT:
  - NIM Image: `nvcr.io/nim/nvidia/llm-nim:1.15.5`
  - GPU Count: 8x 80GB
  - Additional Environment Variables:
    - `NIM_MODEL_PROFILE`: `vllm`
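For capacity planning it can help to have the training options from all three model sections in one place. The sketch below encodes the per-model GPU requirements from this page as a lookup table; the structure and helper function are illustrative assumptions, not part of NeMo Customizer:

```python
# Training options from this page, keyed by default model name.
# Illustrative data structure; not a NeMo Customizer API.
TRAINING_OPTIONS = {
    "meta-llama/Llama-3.2-3B-Instruct": {
        "lora": {"gpus": 1, "tensor_parallel": 1},
        "full_sft": {"gpus": 4, "tensor_parallel": 2},
    },
    "meta-llama/Llama-3.2-1B-Instruct": {
        "lora": {"gpus": 1, "tensor_parallel": 1},
        "full_sft": {"gpus": 1, "tensor_parallel": 1},
    },
    "meta-llama/Llama-3.1-8B-Instruct": {
        "lora": {"gpus": 1, "tensor_parallel": 1},
        "full_sft": {"gpus": 8, "tensor_parallel": 4},
    },
}

def required_gpus(model: str, method: str) -> int:
    """Number of 80GB GPUs needed to train `model` with `method` ('lora' or 'full_sft')."""
    return TRAINING_OPTIONS[model][method]["gpus"]
```

For example, `required_gpus("meta-llama/Llama-3.1-8B-Instruct", "full_sft")` returns 8, while every model on this page trains with LoRA on a single 80GB GPU.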