Model Catalog#

Explore the model families and sizes supported by NVIDIA NeMo Customizer.

Tip: For specific values required to create customization targets, refer to the customization target value reference guide.

Before You Start#

If downloading models hosted on Hugging Face, ensure that `hfTargetDownload` is enabled in the Helm configuration and that a Hugging Face API key secret is available. Refer to the Hugging Face API Key Secret Guide for setup instructions.
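As a minimal sketch of this prerequisite, the Helm values enabling Hugging Face downloads might look like the fragment below. The exact key paths and the secret name are assumptions here; confirm them against your NeMo Customizer chart's `values.yaml`.

```yaml
# Illustrative Helm values fragment -- key paths and names are assumptions;
# verify against the NeMo Customizer chart before use.
customizer:
  modelsStorage:
    hfTargetDownload: true      # enable pulling target models from Hugging Face
  hfAPISecret: hf-api-secret    # Kubernetes secret holding the Hugging Face API key
```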


Model Families#

- **Llama Models**: View the available Llama models from Meta, ranging from 8 billion to 70 billion parameters.
- **Llama Nemotron Models**: View the available Llama Nemotron models from NVIDIA, including Nano and Super variants for efficient and advanced instruction tuning.
- **Phi Models**: View the available Phi models from Microsoft, designed for strong reasoning capabilities with efficient deployment.
- **Embedding Models**: View the available embedding models optimized for retrieval and question-answering tasks.
- **GPT-OSS Models**: View the available GPT-OSS models supported for customization.
- **Gemma Models**: View the available Gemma models from Google, such as the 2B instruction-tuned variant.

Tested Models#

The following tables list the models that NVIDIA has tested and their available features. NeMo Customizer works with all LLM NIM microservices, so models available for fine-tuning are not limited to those listed here.

For detailed technical specifications of each model, such as architecture, parameter count, and token limits, refer to the model family pages.

Large Language Models#

The following models support both chat and completion model training.

| Model | Train a Chat Model with Tool Calling | Fine-tuning Options | Sequence Packing [1] | Inference with NIM | Reasoning |
|---|---|---|---|---|---|
| `meta/llama-3.3-70b-instruct` | Yes | LoRA | Yes | Supported (unverified) | No |
| `meta/llama-3.2-3b-instruct` | Yes | All Weights, LoRA | Yes | Supported (unverified) | No |
| `meta/llama-3.2-1b-instruct` | Yes | All Weights, LoRA | Yes | Supported | No |
| `meta/llama-3.1-70b-instruct` | Yes | LoRA | Yes | Supported (unverified) | No |
| `meta/llama-3.1-8b-instruct` | Yes | All Weights, LoRA | Yes | Supported | No |
| `meta/llama3-70b-instruct` | Yes | LoRA | Yes | Supported (unverified) | No |
| `nvidia/nemotron-nano-llama-3.1-8b@1.0` | No | All Weights, LoRA | Yes | Supported | No |
| `nvidia/nemotron-super-llama-3.3-49b@1.0` | No | LoRA | Yes | Supported | No |
| `microsoft/phi-4` | No | All Weights, LoRA | No | Not supported | No |
| `google/gemma-2b-it` | Yes | All Weights, LoRA | No | Not specified | No |
| `openai/gpt-oss-20b` | No | All Weights | No | Supported | No |
| `openai/gpt-oss-120b` | No | All Weights | No | Supported | No |
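To show how the "Fine-tuning Options" column maps to a customization request, here is an illustrative sketch of a LoRA job payload for one of the tested models. The endpoint path, service address, dataset name, and hyperparameter field names below are assumptions, not the authoritative API shape; verify them against the NeMo Customizer API reference before use.

```python
# Illustrative payload for a LoRA customization job targeting a tested model.
# NOTE: the service URL, endpoint path, and field names are assumptions;
# confirm them against the NeMo Customizer API reference.

CUSTOMIZER_URL = "http://nemo-customizer:8000"  # assumed service address

payload = {
    "config": "meta/llama-3.1-8b-instruct",  # a model from the table above
    "dataset": {"name": "my-sft-dataset"},   # hypothetical dataset name
    "hyperparameters": {
        "training_type": "sft",
        "finetuning_type": "lora",           # the table's "All Weights" rows
                                             # would use full-weight tuning instead
        "epochs": 2,
        "lora": {"adapter_dim": 16},
    },
}

# To submit (requires a reachable Customizer deployment):
# import requests
# resp = requests.post(f"{CUSTOMIZER_URL}/v1/customization/jobs", json=payload)
# resp.raise_for_status()
```

Models whose "Fine-tuning Options" cell lists only LoRA (for example, the 49B and 70B variants) would reject a full-weight configuration, so check the table before choosing the tuning type.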

Embedding Models#

| Model | Fine-tuning Options | Inference with NIM |
|---|---|---|
| `nvidia/llama-3.2-nv-embedqa-1b-v2` | All Weights | Supported |

For detailed technical specifications and configuration information for embedding models, see the Embedding Models page.