# Model Catalog

Explore the model families and sizes supported by NVIDIA NeMo Customizer.

> **Tip:** For information on setting up model entities for customization, see the Manage Model Entities guide. For fine-tuning and deployment tutorials, see the Tutorials guide.

## Before You Start

If you are downloading models hosted on Hugging Face, create a secret containing your Hugging Face API token, then create a FileSet and a model entity referencing the model. See Manage Model Entities for Customization for setup instructions.
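The secret described above can be sketched as a Kubernetes manifest, assuming NeMo Customizer is deployed on Kubernetes. The secret name and key below are placeholders, not values mandated by the product; match them to the names your deployment expects.

```yaml
# Illustrative sketch only: a Kubernetes Secret holding a Hugging Face
# API token. "hf-api-secret" and the "HF_TOKEN" key are placeholder
# names -- check your deployment's configuration for the expected values.
apiVersion: v1
kind: Secret
metadata:
  name: hf-api-secret
type: Opaque
stringData:
  HF_TOKEN: <your-hugging-face-token>
```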


## Model Families

- **Llama Models**: Llama models from Meta, ranging from 8 billion to 70 billion parameters.
- **Llama Nemotron Models**: Llama Nemotron models from NVIDIA, including Nano and Super variants for efficient and advanced instruction tuning.
- **Phi Models**: Phi models from Microsoft, designed for strong reasoning capabilities with efficient deployment.
- **Embedding Models**: Embedding models optimized for retrieval and question-answering tasks.
- **GPT-OSS Models**: GPT-OSS models supported for customization.
- **Qwen Models**: Qwen models from Alibaba Cloud, including compact variants for efficient customization.
- **Mistral Models**: Mistral models, including Mistral and Ministral variants for instruction-following and reasoning tasks.

## Tested Models

The following tables list the models that NVIDIA has tested and the features available for each. NeMo Customizer works with all LLM NIM microservices, so the models available for fine-tuning are not limited to those listed here.

For detailed technical specifications of each model, such as architecture, parameter count, and token limits, refer to the model family pages.

### Large Language Models

The following models support both chat and completion model training.

| Model | Train a Chat Model with Tool Calling | Fine-tuning Options | Sequence Packing [1] | Inference with NIM | Reasoning |
|---|---|---|---|---|---|
| `meta-llama/Llama-3.2-3B-Instruct` | Yes | Full SFT, LoRA | Yes | Supported | No |
| `meta-llama/Llama-3.2-1B-Instruct` | Yes | Full SFT, LoRA | Yes | Supported | No |
| `meta-llama/Llama-3.1-8B-Instruct` | Yes | Full SFT, LoRA | Yes | Supported | No |
| `nvidia/Llama-3.1-Nemotron-Nano-8B-v1` | No | Full SFT, LoRA | Yes | Supported | Yes |
| `nvidia/NVIDIA-Nemotron-Nano-9B-v2` | No | Full SFT, LoRA | No | Supported | Yes |
| `nvidia/NVIDIA-Nemotron-3-Nano-30B-A3B-BF16` | No | Full SFT, LoRA | No | Supported (Full SFT only) | Yes |
| `nvidia/NVIDIA-Nemotron-3-Super-120B-A12B-BF16` | No | LoRA | No | Supported | Yes |
| `microsoft/phi-4` | No | Full SFT, LoRA | No | Supported | No |
| `openai/gpt-oss-20b` | Yes | Full SFT, LoRA | No | Supported | Yes |
| `Qwen/Qwen2.5-1.5B-Instruct` | No | Full SFT, LoRA | No | Supported | Yes |
| `Qwen/Qwen3-0.6B` | No | Full SFT, LoRA | No | Supported | Yes |
| `mistralai/Mistral-7B-Instruct-v0.3` | No | Full SFT, LoRA | No | Supported | No |
| `mistralai/Ministral-3-3B-Instruct-2512` | No | Full SFT, LoRA | No | No | No |
| `mistralai/Ministral-3-3B-Reasoning-2512` | No | Full SFT, LoRA | Yes | No | Yes |
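The "Fine-tuning Options" column maps to the fine-tuning type you request when submitting a customization job. The sketch below builds such a request body in Python; the field names (`config`, `hyperparameters`, `finetuning_type`, and so on) are assumptions modeled on typical NeMo Customizer job configurations, not a verbatim API schema, so check your deployment's API reference before use.

```python
# Illustrative sketch only: build a JSON request body for a LoRA
# fine-tuning job on a tested model from the table above. Field names
# and values are assumptions -- consult the NeMo Customizer API
# reference for the exact schema your deployment expects.
import json


def build_customization_job(model: str, finetuning_type: str, dataset: str) -> str:
    """Return a JSON body selecting a model and one of its listed
    fine-tuning options ("lora" or full SFT)."""
    payload = {
        "config": model,                          # model identifier from the table
        "dataset": {"name": dataset},             # placeholder dataset name
        "hyperparameters": {
            "training_type": "sft",
            "finetuning_type": finetuning_type,   # e.g. "lora"
            "epochs": 1,
        },
    }
    return json.dumps(payload, indent=2)


body = build_customization_job(
    "meta-llama/Llama-3.1-8B-Instruct", "lora", "my-dataset"
)
print(body)
```

The resulting body would be sent to the customization jobs endpoint of your NeMo Customizer deployment.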

### Embedding Models

| Model | Fine-tuning Options | Inference with NIM |
|---|---|---|
| `nvidia/llama-nemotron-embed-1b-v2` | Full SFT, LoRA (merged) | Supported |

For detailed technical specifications and configuration information for embedding models, see the Embedding Models page.