Llama Models

This page provides detailed technical specifications for the Llama model family supported by the NVIDIA NeMo Customizer microservice. For information about supported features and capabilities, refer to the Support Matrix in the Model Catalog.

Llama-3.3-70b Instruct

| Property | Value |
| --- | --- |
| Creator | Meta |
| Architecture | transformer |
| Description | Llama-3.3-70b is a large language model optimized for advanced dialogue and reasoning. |
| Max I/O Tokens | 8192 |
| Parameters | 70 billion |
| Training Data | 15+ trillion tokens (up to 2024) |
| Recommended GPUs for Customization | 16 |
| Default Name | meta/llama-3.3-70b-instruct |
| Version | nvidia/nemo/llama-3_3-70b-instruct-nemo:1.0 |
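The Version values on this page all look like `org/team/name:tag`. The following is a minimal sketch of splitting such a string into its parts, assuming that layout holds. The `ModelVersion` type and `parse_version` helper are hypothetical names used for illustration and are not part of the NeMo Customizer API.

```python
from typing import NamedTuple


class ModelVersion(NamedTuple):
    """Parts of a Version string, under the assumed org/team/name:tag layout."""
    org: str
    team: str
    name: str
    tag: str


def parse_version(version: str) -> ModelVersion:
    """Split a Version value into its parts (hypothetical helper).

    Assumes exactly three slash-separated path parts followed by ':tag'.
    """
    # Split on the last colon so the tag is whatever follows it.
    path, sep, tag = version.rpartition(":")
    if not sep or not tag:
        raise ValueError(f"missing ':tag' suffix in {version!r}")
    parts = path.split("/")
    if len(parts) != 3 or not all(parts):
        raise ValueError(f"expected org/team/name in {version!r}")
    return ModelVersion(*parts, tag)


parsed = parse_version("nvidia/nemo/llama-3_3-70b-instruct-nemo:1.0")
print(parsed.name, parsed.tag)
```

Rejecting strings that do not match the assumed layout keeps a malformed value from being silently misread as a valid model reference.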

Llama-3.2-3b Instruct

| Property | Value |
| --- | --- |
| Creator | Meta |
| Architecture | transformer |
| Description | Llama-3.2-3b is a compact yet powerful language model suitable for various dialogue applications. |
| Max I/O Tokens | 8192 |
| Parameters | 3 billion |
| Training Data | 15+ trillion tokens (up to 2024) |
| Recommended GPUs for Customization | 1 |
| Default Name | meta/llama-3.2-3b-instruct |
| Version | nvidia/nemo/llama-3_2-3b-instruct-nemo:1.0 |

Llama-3.2-1b Instruct

| Property | Value |
| --- | --- |
| Creator | Meta |
| Architecture | transformer |
| Description | Llama-3.2-1b is a lightweight language model designed for efficient deployment while maintaining strong capabilities. |
| Max I/O Tokens | 8192 |
| Parameters | 1 billion |
| Training Data | 15+ trillion tokens (up to 2024) |
| Recommended GPUs for Customization | 1 |
| Default Name | meta/llama-3.2-1b-instruct |
| Version | nvidia/nemo/llama-3_2-1b-instruct-nemo:1.0 |

Llama-3.1-70b Instruct

| Property | Value |
| --- | --- |
| Creator | Meta |
| Architecture | transformer |
| Description | Llama-3.1-70b is a large language model optimized for multilingual dialogue use cases. |
| Max I/O Tokens | 8192 |
| Parameters | 70 billion |
| Training Data | 15 trillion tokens (up to December 2023) |
| Recommended GPUs for Customization | 16 |
| Default Name | meta/llama-3.1-70b-instruct |
| Version | nvidia/nemo/llama-3_1-70b-instruct-nemo:1.0 |

Llama-3.1-8b Instruct

| Property | Value |
| --- | --- |
| Creator | Meta |
| Architecture | transformer |
| Description | Llama-3.1-8b is a large language model optimized for multilingual dialogue use cases. |
| Max I/O Tokens | 8192 |
| Parameters | 8 billion |
| Training Data | 15 trillion tokens (up to December 2023) |
| Recommended GPUs for Customization | 4 |
| Default Name | meta/llama-3.1-8b-instruct |
| Version | nvidia/nemo/llama-3_1-8b-instruct-nemo:1.0 |

Llama-3-70b Instruct

| Property | Value |
| --- | --- |
| Creator | Meta |
| Architecture | transformer |
| Description | Llama-3-70b is a large language model, part of a collection of models capable of generating text and code in response to prompts. |
| Max I/O Tokens | 8192 |
| Parameters | 70 billion |
| Training Data | 15 trillion tokens (up to December 2023) |
| Recommended GPUs for Customization | 16 |
| Default Name | meta/llama-3-70b-instruct |
| Version | nvidia/nemo/llama-3-70b:1.0 |
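The specifications on this page can be compared side by side when choosing a model for the hardware available. The following sketch transcribes the Default Name, Parameters, Max I/O Tokens, and Recommended GPUs for Customization values from the tables on this page into plain Python data and filters them by GPU count. The `LlamaSpec` type and `models_within_gpu_budget` helper are hypothetical names used for illustration. They are not part of the NeMo Customizer API, and the sketch treats the recommended GPU count as a simple threshold, which is an assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LlamaSpec:
    default_name: str
    parameters_b: int      # parameter count, in billions
    max_io_tokens: int
    recommended_gpus: int  # "Recommended GPUs for Customization"


# Values transcribed from the tables on this page.
SPECS = [
    LlamaSpec("meta/llama-3.3-70b-instruct", 70, 8192, 16),
    LlamaSpec("meta/llama-3.2-3b-instruct", 3, 8192, 1),
    LlamaSpec("meta/llama-3.2-1b-instruct", 1, 8192, 1),
    LlamaSpec("meta/llama-3.1-70b-instruct", 70, 8192, 16),
    LlamaSpec("meta/llama-3.1-8b-instruct", 8, 8192, 4),
    LlamaSpec("meta/llama-3-70b-instruct", 70, 8192, 16),
]


def models_within_gpu_budget(available_gpus: int) -> list[str]:
    """Return default names whose recommended GPU count fits the budget.

    Results are ordered from the largest parameter count to the smallest.
    """
    fitting = [s for s in SPECS if s.recommended_gpus <= available_gpus]
    fitting.sort(key=lambda s: s.parameters_b, reverse=True)
    return [s.default_name for s in fitting]


print(models_within_gpu_budget(4))
```

With a budget of 4 GPUs, the helper returns the 8b, 3b, and 1b models in that order, because each 70b model lists 16 recommended GPUs.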