# Model Catalog

<a id="ft-model-catalog" />

Explore the model families and sizes supported by NVIDIA NeMo Customizer.

For information on setting up model entities for customization, see the [Manage Model Entities](/documentation/customizer-reference/manage-model-entities/overview) guide.
For fine-tuning and deployment tutorials, see the [Tutorials](/documentation/customizer-reference/tutorials) guide.

## Before You Start

If you plan to download models hosted on Hugging Face, create a secret that contains your Hugging Face API key, then create a FileSet and a Model Entity that reference the model. See [Manage Model Entities](/documentation/customizer-reference/manage-model-entities/overview) for setup instructions.
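
Before storing the key as a secret, it can help to confirm that the key itself is valid. The following is a minimal sketch, not part of NeMo Customizer: it assumes the public Hugging Face Hub `whoami-v2` endpoint and uses only the Python standard library. The function names `build_whoami_request` and `token_is_valid` are hypothetical helpers introduced for illustration.

```python
import urllib.error
import urllib.request

# Assumption: the public Hugging Face Hub endpoint that identifies the token owner.
HF_WHOAMI_URL = "https://huggingface.co/api/whoami-v2"


def build_whoami_request(token: str) -> urllib.request.Request:
    """Build an authenticated GET request for the whoami endpoint (hypothetical helper)."""
    return urllib.request.Request(
        HF_WHOAMI_URL,
        headers={"Authorization": f"Bearer {token}"},
    )


def token_is_valid(token: str, timeout: float = 10.0) -> bool:
    """Return True if the Hub accepts the token, False if it answers 401 Unauthorized."""
    try:
        with urllib.request.urlopen(build_whoami_request(token), timeout=timeout) as resp:
            return resp.status == 200
    except urllib.error.HTTPError as err:
        if err.code == 401:
            return False
        # Any other HTTP error (rate limiting, outage) is not a verdict on the token.
        raise
```

For example, `token_is_valid(os.environ["HF_TOKEN"])` checks a key exported under the (assumed) variable name `HF_TOKEN`. A valid key does not by itself grant access to gated repositories such as the Meta Llama models; access to those is requested separately on the model's Hugging Face page.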

***

<a id="ft-models" />

## Model Families

- **Llama**: Llama models from Meta, ranging from 8 billion to 70 billion parameters.
- **Llama Nemotron**: Llama Nemotron models from NVIDIA, including Nano and Super variants for efficient and advanced instruction tuning.
- **Phi**: Phi models from Microsoft, designed for strong reasoning capabilities with efficient deployment.
- **Embedding**: Embedding models optimized for retrieval and question-answering tasks.
- **GPT-OSS**: GPT-OSS models supported for customization.
- **Qwen**: Qwen models from Alibaba Cloud, including compact variants for efficient customization.
- **Mistral**: Mistral models, including Mistral and Ministral variants for instruction-following and reasoning tasks.

## Tested Models

The following table lists models that NVIDIA tested and their available features. This is a list of *known-good* combinations, not a list of limits: NeMo Customizer can fine-tune many models and regimes beyond those listed, including additional Hugging Face checkpoints, other fine-tuning regimes (LoRA, merged-LoRA, full-weight, distillation), and either training backend (Automodel or Unsloth). Models and regimes outside this table may work but have not been formally validated.

For detailed technical specifications of each model such as architecture, parameters, and token limits, refer to the [model family](#model-families) pages.

### Large Language Models

The following models support both chat and completion model training.

| Model                                                                                                                 | Train a Chat Model with Tool Calling | Fine-tuning Options | Sequence Packing[^1] | Inference with NIM        | Reasoning |
| --------------------------------------------------------------------------------------------------------------------- | ------------------------------------ | ------------------- | -------------------- | ------------------------- | --------- |
| [meta-llama/Llama-3.2-3B-Instruct](https://huggingface.co/meta-llama/Llama-3.2-3B-Instruct)                           | Yes                                  | Full SFT, LoRA      | Yes                  | Supported                 | No        |
| [meta-llama/Llama-3.2-1B-Instruct](https://huggingface.co/meta-llama/Llama-3.2-1B-Instruct)                           | Yes                                  | Full SFT, LoRA      | Yes                  | Supported                 | No        |
| [meta-llama/Llama-3.1-8B-Instruct](https://huggingface.co/meta-llama/Llama-3.1-8B-Instruct)                           | Yes                                  | Full SFT, LoRA      | Yes                  | Supported                 | No        |
| [nvidia/Llama-3.1-Nemotron-Nano-8B-v1](https://huggingface.co/nvidia/Llama-3.1-Nemotron-Nano-8B-v1)                   | No                                   | Full SFT, LoRA      | Yes                  | Supported                 | Yes       |
| [nvidia/NVIDIA-Nemotron-Nano-9B-v2](https://huggingface.co/nvidia/NVIDIA-Nemotron-Nano-9B-v2)                         | No                                   | Full SFT, LoRA      | No                   | Supported                 | Yes       |
| [nvidia/NVIDIA-Nemotron-3-Nano-30B-A3B-BF16](https://huggingface.co/nvidia/NVIDIA-Nemotron-3-Nano-30B-A3B-BF16)       | No                                   | Full SFT, LoRA      | No                   | Supported (only Full SFT) | Yes       |
| [nvidia/NVIDIA-Nemotron-3-Super-120B-A12B-BF16](https://huggingface.co/nvidia/NVIDIA-Nemotron-3-Super-120B-A12B-BF16) | No                                   | LoRA                | No                   | Supported                 | Yes       |
| [microsoft/phi-4](https://huggingface.co/microsoft/phi-4)                                                             | No                                   | Full SFT, LoRA      | No                   | Supported                 | No        |
| [openai/gpt-oss-20b](https://huggingface.co/openai/gpt-oss-20b)                                                       | Yes                                  | Full SFT, LoRA      | No                   | Supported                 | Yes       |
| [Qwen/Qwen2.5-1.5B-Instruct](https://huggingface.co/Qwen/Qwen2.5-1.5B-Instruct)                                       | No                                   | Full SFT, LoRA      | No                   | Supported                 | Yes       |
| [Qwen/Qwen3-0.6B](https://huggingface.co/Qwen/Qwen3-0.6B)                                                             | No                                   | Full SFT, LoRA      | No                   | Supported                 | Yes       |
| [mistralai/Mistral-7B-Instruct-v0.3](https://huggingface.co/mistralai/Mistral-7B-Instruct-v0.3)                       | No                                   | Full SFT, LoRA      | No                   | Supported                 | No        |
| [mistralai/Ministral-3-3B-Instruct-2512](https://huggingface.co/mistralai/Ministral-3-3B-Instruct-2512)               | No                                   | Full SFT, LoRA      | No                   | No                        | No        |
| [mistralai/Ministral-3-3B-Reasoning-2512](https://huggingface.co/mistralai/Ministral-3-3B-Reasoning-2512)             | No                                   | Full SFT, LoRA      | Yes                  | No                        | Yes       |

[^1]: For details, see [sequence packing with NeMo Framework](https://docs.nvidia.com/nemo-framework/user-guide/latest/sft_peft/packed_sequence.html#).
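
When choosing a base model, it can be convenient to query the table above programmatically. The sketch below transcribes the table into plain Python data and filters it by feature. The names `CatalogEntry`, `TESTED_LLMS`, and `find_models` are hypothetical and exist only for this illustration; they are not part of any NeMo Customizer API, and the data is a manual copy of the table rather than a live query.

```python
from typing import List, NamedTuple, Optional, Tuple


class CatalogEntry(NamedTuple):
    """One row of the tested LLM table (hypothetical structure for illustration)."""
    name: str
    tool_calling: bool
    fine_tuning: Tuple[str, ...]
    sequence_packing: bool
    nim_inference: str  # copied verbatim from the "Inference with NIM" column
    reasoning: bool


BOTH = ("Full SFT", "LoRA")

# Manual transcription of the table above; update it if the table changes.
TESTED_LLMS = [
    CatalogEntry("meta-llama/Llama-3.2-3B-Instruct", True, BOTH, True, "Supported", False),
    CatalogEntry("meta-llama/Llama-3.2-1B-Instruct", True, BOTH, True, "Supported", False),
    CatalogEntry("meta-llama/Llama-3.1-8B-Instruct", True, BOTH, True, "Supported", False),
    CatalogEntry("nvidia/Llama-3.1-Nemotron-Nano-8B-v1", False, BOTH, True, "Supported", True),
    CatalogEntry("nvidia/NVIDIA-Nemotron-Nano-9B-v2", False, BOTH, False, "Supported", True),
    CatalogEntry("nvidia/NVIDIA-Nemotron-3-Nano-30B-A3B-BF16", False, BOTH, False,
                 "Supported (only Full SFT)", True),
    CatalogEntry("nvidia/NVIDIA-Nemotron-3-Super-120B-A12B-BF16", False, ("LoRA",), False,
                 "Supported", True),
    CatalogEntry("microsoft/phi-4", False, BOTH, False, "Supported", False),
    CatalogEntry("openai/gpt-oss-20b", True, BOTH, False, "Supported", True),
    CatalogEntry("Qwen/Qwen2.5-1.5B-Instruct", False, BOTH, False, "Supported", True),
    CatalogEntry("Qwen/Qwen3-0.6B", False, BOTH, False, "Supported", True),
    CatalogEntry("mistralai/Mistral-7B-Instruct-v0.3", False, BOTH, False, "Supported", False),
    CatalogEntry("mistralai/Ministral-3-3B-Instruct-2512", False, BOTH, False, "No", False),
    CatalogEntry("mistralai/Ministral-3-3B-Reasoning-2512", False, BOTH, True, "No", True),
]


def find_models(
    tool_calling: Optional[bool] = None,
    sequence_packing: Optional[bool] = None,
    reasoning: Optional[bool] = None,
    fine_tuning: Optional[str] = None,
) -> List[str]:
    """Return the names of tested models that match every filter that is not None."""
    matches = []
    for entry in TESTED_LLMS:
        if tool_calling is not None and entry.tool_calling != tool_calling:
            continue
        if sequence_packing is not None and entry.sequence_packing != sequence_packing:
            continue
        if reasoning is not None and entry.reasoning != reasoning:
            continue
        if fine_tuning is not None and fine_tuning not in entry.fine_tuning:
            continue
        matches.append(entry.name)
    return matches


# Tested models that support both tool-calling training and sequence packing.
print(find_models(tool_calling=True, sequence_packing=True))
```

With the table as transcribed, this query returns the three Meta Llama Instruct models, and `find_models(tool_calling=True, reasoning=True)` returns only `openai/gpt-oss-20b`. An empty result means only that no tested combination matches; as noted above, untested models and regimes may still work.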

### Embedding Models

| Model                                                                                         | Fine-tuning Options     | Inference with NIM |
| --------------------------------------------------------------------------------------------- | ----------------------- | ------------------ |
| [nvidia/llama-nemotron-embed-1b-v2](https://huggingface.co/nvidia/llama-nemotron-embed-1b-v2) | Full SFT, LoRA (merged) | Supported          |

For detailed technical specifications and configuration information for embedding models, see the [Embedding Models](/documentation/customizer-reference/models/embedding) page.