Model Catalog | NVIDIA NeMo Platform

Explore the model families and sizes supported by NVIDIA NeMo Customizer.

For information on setting up model entities for customization, see the Manage Model Entities guide. For fine-tuning and deployment tutorials, see the Tutorials guide.

Before You Start

If you download models hosted on Hugging Face, first create a secret that contains your Hugging Face API key, then create a FileSet and a Model Entity that reference the model. See index for setup instructions.

Model Families

Llama Models

View the available Llama models from Meta, ranging from 8 billion to 70 billion parameters.

Llama Nemotron Models

View the available Llama Nemotron models from NVIDIA, including Nano and Super variants for efficient and advanced instruction tuning.

Phi Models

View the available Phi models from Microsoft, designed for strong reasoning capabilities with efficient deployment.

Embedding Models

View the available embedding models optimized for retrieval and question-answering tasks.

GPT-OSS Models

View the available GPT-OSS models supported for customization.

Qwen Models

View the available Qwen models from Alibaba Cloud, including compact variants for efficient customization.

Mistral Models

View the available Mistral models, including Mistral and Ministral variants for instruction-following and reasoning tasks.

Tested Models

The following tables list the models that NVIDIA tested and the features available for each. This is a list of known-good combinations, not a list of limits: NeMo Customizer can fine-tune additional Hugging Face checkpoints and regimes when their architectures are supported by the selected training backend. Compatibility varies by architecture and fine-tuning method. For example, Automodel LoRA does not support Conv1D-based models. Models and regimes outside these tables may work but have not been formally validated; test them before relying on them in production.
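
As a rough pre-check for an untested checkpoint, the sketch below lists submodules whose class is named `Conv1D`. It assumes the Conv1D restriction refers to the `Conv1D` layer class that Hugging Face Transformers uses in GPT-2-style models, and that the model exposes a PyTorch-style `named_modules()` method. The helper name `find_conv1d_layers` and the stand-in classes are hypothetical. This is a heuristic, not how NeMo Customizer itself decides compatibility.

```python
def find_conv1d_layers(model):
    """Return the names of submodules whose class is named 'Conv1D'.

    `model` is any object with a PyTorch-style named_modules() method
    that yields (name, module) pairs.
    """
    return [
        name
        for name, module in model.named_modules()
        if type(module).__name__ == "Conv1D"
    ]


# Stand-in classes so the sketch runs without PyTorch installed.
# They are illustrative only and carry no real layer logic.
class Conv1D:
    pass


class Linear:
    pass


class FakeModel:
    def named_modules(self):
        # PyTorch yields the root module first under the empty name.
        return [
            ("", self),
            ("h.0.attn.c_attn", Conv1D()),
            ("lm_head", Linear()),
        ]


if __name__ == "__main__":
    hits = find_conv1d_layers(FakeModel())
    if hits:
        print("Conv1D layers found:", hits)
    else:
        print("No Conv1D layers found")
```

With Transformers installed, the same helper can be called on a model loaded through `AutoModelForCausalLM.from_pretrained(...)`. An empty result does not guarantee that a model will train; it only rules out this one known limitation.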

For detailed technical specifications of each model, such as architecture, parameter count, and token limits, refer to the model family pages.

Large Language Models

The following models support both chat and completion model training.

Model	Train a Chat Model with Tool Calling	Fine-tuning Options	Sequence Packing¹	Inference with NIM	Reasoning
meta-llama/Llama-3.2-3B-Instruct	Yes	Full SFT, LoRA	Yes	Supported	No
meta-llama/Llama-3.2-1B-Instruct	Yes	Full SFT, LoRA	Yes	Supported	No
meta-llama/Llama-3.1-8B-Instruct	Yes	Full SFT, LoRA	Yes	Supported	No
nvidia/Llama-3.1-Nemotron-Nano-8B-v1	No	Full SFT, LoRA	Yes	Supported	Yes
nvidia/NVIDIA-Nemotron-Nano-9B-v2	No	Full SFT, LoRA	No	Supported	Yes
nvidia/NVIDIA-Nemotron-3-Nano-30B-A3B-BF16	No	Full SFT, LoRA	No	Supported (only Full SFT)	Yes
nvidia/NVIDIA-Nemotron-3-Super-120B-A12B-BF16	No	LoRA	No	Supported	Yes
microsoft/phi-4	No	Full SFT, LoRA	No	Supported	No
openai/gpt-oss-20b	Yes	Full SFT, LoRA	LoRA with prompt-completion data	Supported	Yes
Qwen/Qwen2.5-1.5B-Instruct	No	Full SFT, LoRA	No	Supported	Yes
Qwen/Qwen3-0.6B	No	Full SFT, LoRA	No	Supported	Yes
mistralai/Mistral-7B-Instruct-v0.3	No	Full SFT, LoRA	No	Supported	No
mistralai/Ministral-3-3B-Instruct-2512	No	Full SFT, LoRA	No	No	No
mistralai/Ministral-3-3B-Reasoning-2512	No	Full SFT, LoRA	Yes	No	Yes
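
When choosing a base model programmatically, it can help to hold the table above as data and filter it. The following is a minimal sketch that transcribes the rows as they appear on this page. The names `CatalogRow`, `TESTED_LLMS`, and `select` are hypothetical and are not part of any NeMo API.

```python
from typing import List, NamedTuple, Optional, Tuple


class CatalogRow(NamedTuple):
    name: str
    tool_calling: bool          # "Train a Chat Model with Tool Calling"
    methods: Tuple[str, ...]    # "Fine-tuning Options"
    sequence_packing: str       # kept as text; one row has a qualified value
    nim: str                    # "Inference with NIM", kept as text
    reasoning: bool


BOTH = ("Full SFT", "LoRA")

# Rows copied from the Large Language Models table on this page.
TESTED_LLMS = [
    CatalogRow("meta-llama/Llama-3.2-3B-Instruct", True, BOTH, "Yes", "Supported", False),
    CatalogRow("meta-llama/Llama-3.2-1B-Instruct", True, BOTH, "Yes", "Supported", False),
    CatalogRow("meta-llama/Llama-3.1-8B-Instruct", True, BOTH, "Yes", "Supported", False),
    CatalogRow("nvidia/Llama-3.1-Nemotron-Nano-8B-v1", False, BOTH, "Yes", "Supported", True),
    CatalogRow("nvidia/NVIDIA-Nemotron-Nano-9B-v2", False, BOTH, "No", "Supported", True),
    CatalogRow("nvidia/NVIDIA-Nemotron-3-Nano-30B-A3B-BF16", False, BOTH, "No",
               "Supported (only Full SFT)", True),
    CatalogRow("nvidia/NVIDIA-Nemotron-3-Super-120B-A12B-BF16", False, ("LoRA",), "No",
               "Supported", True),
    CatalogRow("microsoft/phi-4", False, BOTH, "No", "Supported", False),
    CatalogRow("openai/gpt-oss-20b", True, BOTH, "LoRA with prompt-completion data",
               "Supported", True),
    CatalogRow("Qwen/Qwen2.5-1.5B-Instruct", False, BOTH, "No", "Supported", True),
    CatalogRow("Qwen/Qwen3-0.6B", False, BOTH, "No", "Supported", True),
    CatalogRow("mistralai/Mistral-7B-Instruct-v0.3", False, BOTH, "No", "Supported", False),
    CatalogRow("mistralai/Ministral-3-3B-Instruct-2512", False, BOTH, "No", "No", False),
    CatalogRow("mistralai/Ministral-3-3B-Reasoning-2512", False, BOTH, "Yes", "No", True),
]


def select(
    tool_calling: Optional[bool] = None,
    method: Optional[str] = None,
    reasoning: Optional[bool] = None,
) -> List[str]:
    """Return names of tested models that match every filter that is not None."""
    return [
        row.name
        for row in TESTED_LLMS
        if (tool_calling is None or row.tool_calling == tool_calling)
        and (method is None or method in row.methods)
        and (reasoning is None or row.reasoning == reasoning)
    ]


if __name__ == "__main__":
    # Tested models that combine tool-calling training with reasoning.
    print(select(tool_calling=True, reasoning=True))
```

The sequence packing and NIM columns stay as strings because some rows carry qualifiers, such as "Supported (only Full SFT)", that a boolean would lose.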

Embedding Models

Model	Fine-tuning Options	Inference with NIM
nvidia/llama-nemotron-embed-1b-v2	Full SFT, LoRA (merged)	Supported

For detailed technical specifications and configuration information for embedding models, see the Embedding Models page.

¹ Read more on sequence packing with NeMo Framework.