Support Matrix#
This page lists the supported models, their deployment profiles, and the verified hardware SKUs for NIM LLM.
Supported Models and Profiles#
Use the following sections to identify the supported deployment profiles for each model. Profile strings follow a naming convention described in Model Profiles and Selection.
Note
For supported hardware, refer to the Verified GPUs dropdown for each model or the GPU Compatibility section.
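The profile strings used throughout this page (for example, `vllm-fp8-tp4-pp1-lora` in the nemotron-3-super-120b-a12b tables) encode a backend, a precision, tensor parallelism (`tp`), pipeline parallelism (`pp`), and an optional `-lora` suffix. As an illustrative sketch only (the authoritative rules are in Model Profiles and Selection), such a string can be decomposed like this:

```python
import re

# Matches profile strings of the shape seen on this page, e.g.
# "vllm-fp8-tp4-pp1-lora": backend, precision, tp, pp, optional LoRA suffix.
PROFILE_RE = re.compile(
    r"^(?P<backend>[a-z0-9]+)-(?P<precision>[a-z0-9]+)"
    r"-tp(?P<tp>\d+)-pp(?P<pp>\d+)(?P<lora>-lora)?$"
)

def parse_profile(profile: str) -> dict:
    """Split a profile string into its components."""
    m = PROFILE_RE.match(profile)
    if m is None:
        raise ValueError(f"unrecognized profile string: {profile!r}")
    return {
        "backend": m.group("backend"),
        "precision": m.group("precision"),
        "tp": int(m.group("tp")),
        "pp": int(m.group("pp")),
        "lora": m.group("lora") is not None,
    }
```

For example, `parse_profile("vllm-fp8-tp4-pp1-lora")` yields backend `vllm`, precision `fp8`, TP 4, PP 1, with LoRA enabled.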
gpt-oss-120b#
Latest supported NIM LLM version: 2.0.1
The following table lists the supported profile configurations for openai/gpt-oss-120b:
| Precision | TP1 | TP2 | TP4 | TP8 |
|---|---|---|---|---|
| MXFP4 | | | | |
| MXFP4 + LoRA | | | | |
Verified GPUs
This model has been verified on the following GPUs:
- NVIDIA-A100-SXM4-40GB
- NVIDIA-A100-SXM4-80GB
- NVIDIA-B200
- NVIDIA-B300-SXM6-AC
- NVIDIA-GB200
- NVIDIA-GH200-144G-HBM3e
- NVIDIA-H100-80GB-HBM3
- NVIDIA-H100-NVL
- NVIDIA-H200
- NVIDIA-H200-NVL
- NVIDIA-L40S
- NVIDIA-RTX-PRO-6000-Blackwell-Server-Edition
gpt-oss-20b#
Latest supported NIM LLM version: 2.0.1
The following table lists the supported profile configurations for openai/gpt-oss-20b:
| Precision | TP1 | TP2 | TP4 | TP8 |
|---|---|---|---|---|
| MXFP4 | | | | |
| MXFP4 + LoRA | | | | |
Verified GPUs
This model has been verified on the following GPUs:
- NVIDIA-A100-SXM4-40GB
- NVIDIA-A100-SXM4-80GB
- NVIDIA-A10G
- NVIDIA-B200
- NVIDIA-B300-SXM6-AC
- NVIDIA-GB10
- NVIDIA-GB200
- NVIDIA-GH200-144G-HBM3e
- NVIDIA-GH200-480GB
- NVIDIA-H100-80GB-HBM3
- NVIDIA-H200
- NVIDIA-L40S
- NVIDIA-RTX-PRO-4500-Blackwell-Server-Edition
- NVIDIA-RTX-PRO-6000-Blackwell-Server-Edition
llama-3.1-70b-instruct#
Latest supported NIM LLM version: 2.0.1
The following table lists the supported profile configurations for meta/llama-3.1-70b-instruct:
| Precision | TP1 | TP2 | TP4 | TP8 |
|---|---|---|---|---|
| BF16 | | | | |
| BF16 + LoRA | | | | |
| FP8 | | | | |
| FP8 + LoRA | | | | |
| NVFP4 | | | | |
| NVFP4 + LoRA | | | | |
Verified GPUs
This model has been verified on the following GPUs:
- NVIDIA-A10G
- NVIDIA-A100-SXM4-40GB
- NVIDIA-A100-SXM4-80GB
- NVIDIA-B200
- NVIDIA-B300-SXM6-AC
- NVIDIA-GB200
- NVIDIA-GH200-144G-HBM3e
- NVIDIA-GH200-480GB
- NVIDIA-H100-80GB-HBM3
- NVIDIA-H100-NVL
- NVIDIA-H200
- NVIDIA-H200-NVL
- NVIDIA-L40S
- NVIDIA-RTX-PRO-6000-Blackwell-Server-Edition
llama-3.1-8b-instruct#
Latest supported NIM LLM version: 2.0.1
The following table lists the supported profile configurations for meta/llama-3.1-8b-instruct:
| Precision | TP1 |
|---|---|
| BF16 | |
| BF16 + LoRA | |
| FP8 | |
| FP8 + LoRA | |
| NVFP4 | |
| NVFP4 + LoRA | |
Verified GPUs
This model has been verified on the following GPUs:
- NVIDIA-A100-SXM4-40GB
- NVIDIA-A100-SXM4-80GB
- NVIDIA-B200
- NVIDIA-B300-SXM6-AC
- NVIDIA-GB10
- NVIDIA-GB200
- NVIDIA-GH200-144G-HBM3e
- NVIDIA-GH200-480GB
- NVIDIA-H100-80GB-HBM3
- NVIDIA-H100-NVL
- NVIDIA-H200
- NVIDIA-H200-NVL
- NVIDIA-L40S
- NVIDIA-RTX-PRO-4500-Blackwell-Server-Edition
- NVIDIA-RTX-PRO-6000-Blackwell-Server-Edition
llama-3.3-70b-instruct#
Latest supported NIM LLM version: 2.0.1
The following table lists the supported profile configurations for meta/llama-3.3-70b-instruct:
| Precision | TP1 | TP2 | TP4 | TP8 |
|---|---|---|---|---|
| BF16 | | | | |
| BF16 + LoRA | | | | |
| FP8 | | | | |
| FP8 + LoRA | | | | |
| NVFP4 | | | | |
| NVFP4 + LoRA | -- | | | |
Verified GPUs
This model has been verified on the following GPUs:
- NVIDIA-A10G
- NVIDIA-A100-SXM4-40GB
- NVIDIA-A100-SXM4-80GB
- NVIDIA-B200
- NVIDIA-B300-SXM6-AC
- NVIDIA-GB200
- NVIDIA-GH200-144G-HBM3e
- NVIDIA-GH200-480GB
- NVIDIA-H100-80GB-HBM3
- NVIDIA-H100-NVL
- NVIDIA-H200
- NVIDIA-H200-NVL
- NVIDIA-L40S
- NVIDIA-RTX-PRO-6000-Blackwell-Server-Edition
llama-3.3-nemotron-super-49b-v1.5#
Latest supported NIM LLM version: 2.0.1
The following table lists the supported profile configurations for nvidia/llama-3.3-nemotron-super-49b-v1.5:
| Precision | TP1 | TP2 | TP4 | TP8 |
|---|---|---|---|---|
| BF16 | | | | |
| BF16 + LoRA | | | | |
| FP8 | | | | |
| FP8 + LoRA | | | | |
| NVFP4 | | | | |
| NVFP4 + LoRA | | | | |
Verified GPUs
This model has been verified on the following GPUs:
- NVIDIA-A100-SXM4-40GB
- NVIDIA-A100-SXM4-80GB
- NVIDIA-B200
- NVIDIA-B300-SXM6-AC
- NVIDIA-GB10
- NVIDIA-GB200
- NVIDIA-GH200-144G-HBM3e
- NVIDIA-GH200-480GB
- NVIDIA-H100-80GB-HBM3
- NVIDIA-H100-NVL
- NVIDIA-H200
- NVIDIA-H200-NVL
- NVIDIA-L40S
- NVIDIA-RTX-PRO-4500-Blackwell-Server-Edition
- NVIDIA-RTX-PRO-6000-Blackwell-Server-Edition
nemotron-3-nano#
Latest supported NIM LLM version: 2.0.1
The following table lists the supported profile configurations for nvidia/nemotron-3-nano:
| Precision | TP1 | TP2 | TP4 | TP8 |
|---|---|---|---|---|
| BF16 | | | | |
| BF16 + LoRA | | | | |
| FP8 | | | | |
| FP8 + LoRA | | | | |
| NVFP4 | | | | |
| NVFP4 + LoRA | | | | |
Verified GPUs
This model has been verified on the following GPUs:
- NVIDIA-A100-SXM4-40GB
- NVIDIA-A100-SXM4-80GB
- NVIDIA-B200
- NVIDIA-B300-SXM6-AC
- NVIDIA-GB10
- NVIDIA-GB200
- NVIDIA-GH200-144G-HBM3e
- NVIDIA-GH200-480GB
- NVIDIA-H100-80GB-HBM3
- NVIDIA-H100-NVL
- NVIDIA-H200
- NVIDIA-H200-NVL
- NVIDIA-L40S
- NVIDIA-RTX-PRO-4500-Blackwell-Server-Edition
- NVIDIA-RTX-PRO-6000-Blackwell-Server-Edition
nemotron-3-super-120b-a12b#
Latest supported NIM LLM version: 2.0.2
The supported profile configurations for nvidia/nemotron-3-super-120b-a12b vary by verified GPU. The following tables list the supported configurations for each GPU group:
| Precision | TP1 | TP2 | TP4 | TP8 |
|---|---|---|---|---|
| BF16 | -- | -- | vllm-bf16-tp4-pp1 | vllm-bf16-tp8-pp1 |
| BF16 + LoRA | -- | -- | vllm-bf16-tp4-pp1-lora | vllm-bf16-tp8-pp1-lora |
| FP8 | -- | -- | -- | -- |
| FP8 + LoRA | -- | -- | -- | -- |
| NVFP4 | -- | -- | -- | -- |
| Precision | TP1 | TP2 | TP4 | TP8 |
|---|---|---|---|---|
| BF16 | -- | vllm-bf16-tp2-pp1 | vllm-bf16-tp4-pp1 | vllm-bf16-tp8-pp1 |
| BF16 + LoRA | -- | vllm-bf16-tp2-pp1-lora | vllm-bf16-tp4-pp1-lora | vllm-bf16-tp8-pp1-lora |
| FP8 | vllm-fp8-tp1-pp1 | vllm-fp8-tp2-pp1 | vllm-fp8-tp4-pp1 | vllm-fp8-tp8-pp1 |
| FP8 + LoRA | vllm-fp8-tp1-pp1-lora | vllm-fp8-tp2-pp1-lora | vllm-fp8-tp4-pp1-lora | vllm-fp8-tp8-pp1-lora |
| NVFP4 | vllm-nvfp4-tp1-pp1 | vllm-nvfp4-tp2-pp1 | vllm-nvfp4-tp4-pp1 | vllm-nvfp4-tp8-pp1 |
| Precision | TP1 | TP2 | TP4 | TP8 |
|---|---|---|---|---|
| BF16 | vllm-bf16-tp1-pp1 | vllm-bf16-tp2-pp1 | vllm-bf16-tp4-pp1 | vllm-bf16-tp8-pp1 |
| BF16 + LoRA | vllm-bf16-tp1-pp1-lora | vllm-bf16-tp2-pp1-lora | vllm-bf16-tp4-pp1-lora | vllm-bf16-tp8-pp1-lora |
| FP8 | vllm-fp8-tp1-pp1 | vllm-fp8-tp2-pp1 | vllm-fp8-tp4-pp1 | vllm-fp8-tp8-pp1 |
| FP8 + LoRA | vllm-fp8-tp1-pp1-lora | vllm-fp8-tp2-pp1-lora | vllm-fp8-tp4-pp1-lora | vllm-fp8-tp8-pp1-lora |
| NVFP4 | vllm-nvfp4-tp1-pp1 | vllm-nvfp4-tp2-pp1 | vllm-nvfp4-tp4-pp1 | vllm-nvfp4-tp8-pp1 |
| Precision | TP1 | TP2 | TP4 | TP8 |
|---|---|---|---|---|
| BF16 | -- | vllm-bf16-tp2-pp1 | vllm-bf16-tp4-pp1 | vllm-bf16-tp8-pp1 |
| BF16 + LoRA | -- | vllm-bf16-tp2-pp1-lora | vllm-bf16-tp4-pp1-lora | vllm-bf16-tp8-pp1-lora |
| FP8 | vllm-fp8-tp1-pp1 | vllm-fp8-tp2-pp1 | vllm-fp8-tp4-pp1 | vllm-fp8-tp8-pp1 |
| FP8 + LoRA | vllm-fp8-tp1-pp1-lora | vllm-fp8-tp2-pp1-lora | vllm-fp8-tp4-pp1-lora | vllm-fp8-tp8-pp1-lora |
| NVFP4 | -- | -- | -- | -- |
| Precision | TP1 | TP2 | TP4 | TP8 |
|---|---|---|---|---|
| BF16 | -- | -- | vllm-bf16-tp4-pp1 | vllm-bf16-tp8-pp1 |
| BF16 + LoRA | -- | -- | vllm-bf16-tp4-pp1-lora | vllm-bf16-tp8-pp1-lora |
| FP8 | -- | vllm-fp8-tp2-pp1 | vllm-fp8-tp4-pp1 | vllm-fp8-tp8-pp1 |
| FP8 + LoRA | -- | vllm-fp8-tp2-pp1-lora | vllm-fp8-tp4-pp1-lora | vllm-fp8-tp8-pp1-lora |
| NVFP4 | -- | -- | -- | -- |
| Precision | TP1 | TP2 | TP4 | TP8 |
|---|---|---|---|---|
| BF16 | -- | -- | -- | vllm-bf16-tp8-pp1 |
| BF16 + LoRA | -- | -- | -- | -- |
| FP8 | -- | -- | vllm-fp8-tp4-pp1 | vllm-fp8-tp8-pp1 |
| FP8 + LoRA | -- | -- | -- | -- |
| NVFP4 | -- | -- | -- | -- |
| Precision | TP1 | TP2 | TP4 | TP8 |
|---|---|---|---|---|
| BF16 | -- | -- | -- | -- |
| BF16 + LoRA | -- | -- | -- | -- |
| FP8 | -- | -- | -- | vllm-fp8-tp8-pp1 |
| FP8 + LoRA | -- | -- | -- | vllm-fp8-tp8-pp1-lora |
| NVFP4 | -- | -- | vllm-nvfp4-tp4-pp1 | vllm-nvfp4-tp8-pp1 |
| Precision | TP1 | TP2 | TP4 | TP8 |
|---|---|---|---|---|
| BF16 | -- | -- | -- | vllm-bf16-tp8-pp1 |
| BF16 + LoRA | -- | -- | -- | vllm-bf16-tp8-pp1-lora |
| FP8 | -- | -- | vllm-fp8-tp4-pp1 | vllm-fp8-tp8-pp1 |
| FP8 + LoRA | -- | -- | vllm-fp8-tp4-pp1-lora | vllm-fp8-tp8-pp1-lora |
| NVFP4 | vllm-nvfp4-tp1-pp1 | vllm-nvfp4-tp2-pp1 | vllm-nvfp4-tp4-pp1 | vllm-nvfp4-tp8-pp1 |
Note
This is a large model. Lower-TP profiles require substantially more GPU memory per device, so some verified GPUs support only TP4 or TP8 profiles.
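Because each profile name encodes its tensor and pipeline parallelism, one way to narrow the tables above to a given node is to keep only profiles whose total GPU count (TP × PP) matches the GPUs available. The helper below is a hypothetical sketch, not part of NIM:

```python
import re

def profiles_for_gpu_count(profiles, num_gpus):
    """Keep profiles whose tp * pp (read from the profile name) equals num_gpus."""
    selected = []
    for p in profiles:
        m = re.search(r"-tp(\d+)-pp(\d+)", p)
        if m and int(m.group(1)) * int(m.group(2)) == num_gpus:
            selected.append(p)
    return selected

# Example with profile names taken from the tables above:
candidates = ["vllm-bf16-tp4-pp1", "vllm-bf16-tp8-pp1", "vllm-fp8-tp8-pp1"]
```

On an 8-GPU node, `profiles_for_gpu_count(candidates, 8)` keeps only the two TP8 profiles.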
Verified GPUs
The following GPUs have been verified with one or more supported profiles for this model:
- NVIDIA-A100-SXM4-80GB
- NVIDIA-B200
- NVIDIA-B300-SXM6-AC
- NVIDIA-GB200
- NVIDIA-GH200-144G-HBM3e
- NVIDIA-H100-80GB-HBM3
- NVIDIA-H100-NVL
- NVIDIA-H200
- NVIDIA-H200-NVL
- NVIDIA-L40S
- NVIDIA-RTX-PRO-4500-Blackwell-Server-Edition
- NVIDIA-RTX-PRO-6000-Blackwell-Server-Edition
starcoder2-7b#
Latest supported NIM LLM version: 2.0.1
The following table lists the supported profile configurations for bigcode/starcoder2-7b:
| Precision | TP1 | TP2 |
|---|---|---|
| BF16 | | |
Verified GPUs
This model has been verified on the following GPUs:
- NVIDIA-H100-80GB-HBM3
- NVIDIA-H200
Model-Free NIM#
Latest supported NIM LLM version: 2.0.1
The following models are tested and validated for nvidia/model-free-nim:
- gpt-oss-20b
- apriel-nemotron
- codestral
While other models are not explicitly validated, the model-free NIM can be used with any model supported by the underlying vLLM backend version. Refer to Model-Free NIM for deployment details.
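Once deployed, a NIM LLM endpoint (model-free or otherwise) serves an OpenAI-compatible HTTP API. The sketch below builds a minimal chat-completions request; the base URL, port, and model name are placeholders that depend on your deployment:

```python
import json
import urllib.request

def build_chat_request(model: str, prompt: str, max_tokens: int = 64) -> dict:
    """Minimal OpenAI-compatible chat-completions payload."""
    return {
        "model": model,  # placeholder: use the model name your deployment serves
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": max_tokens,
    }

def send_chat_request(base_url: str, payload: dict) -> dict:
    """POST the payload to {base_url}/v1/chat/completions on a running endpoint."""
    req = urllib.request.Request(
        f"{base_url}/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)

if __name__ == "__main__":
    payload = build_chat_request("gpt-oss-20b", "Write a haiku about GPUs.")
    # Requires a running deployment, for example:
    # print(send_chat_request("http://localhost:8000", payload))
```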
Verified GPUs
The model-free NIM has been verified on the following GPUs:
- NVIDIA-A100-80GB-PCIe
- NVIDIA-A100-PCIE-40GB
- NVIDIA-A100-SXM4-40GB
- NVIDIA-A100-SXM4-80GB
- NVIDIA-B300-SXM6-AC
- NVIDIA-GH200-480GB
- NVIDIA-H100-80GB-HBM3
- NVIDIA-H100-NVL
- NVIDIA-H100-PCIe
- NVIDIA-H200
- NVIDIA-H200-NVL
- NVIDIA-RTX-PRO-4500-Blackwell-Server-Edition
GPU Compatibility#
Use the following dropdowns to determine which models are supported on a given GPU:
NVIDIA-A10G
The following models have been verified on this GPU:
- gpt-oss-20b
- llama-3.1-70b-instruct
- llama-3.3-70b-instruct
NVIDIA-A100-SXM4-40GB
The following models have been verified on this GPU:
- gpt-oss-120b
- gpt-oss-20b
- llama-3.1-70b-instruct
- llama-3.1-8b-instruct
- llama-3.3-70b-instruct
- llama-3.3-nemotron-super-49b-v1.5
- nemotron-3-nano
NVIDIA-A100-SXM4-80GB
The following models have been verified on this GPU:
- gpt-oss-120b
- gpt-oss-20b
- llama-3.1-70b-instruct
- llama-3.1-8b-instruct
- llama-3.3-70b-instruct
- llama-3.3-nemotron-super-49b-v1.5
- nemotron-3-nano
- nemotron-3-super-120b-a12b
NVIDIA-B200
The following models have been verified on this GPU:
- gpt-oss-120b
- gpt-oss-20b
- llama-3.1-70b-instruct
- llama-3.1-8b-instruct
- llama-3.3-70b-instruct
- llama-3.3-nemotron-super-49b-v1.5
- nemotron-3-nano
- nemotron-3-super-120b-a12b
NVIDIA-B300-SXM6-AC
The following models have been verified on this GPU:
- gpt-oss-120b
- gpt-oss-20b
- llama-3.1-70b-instruct
- llama-3.1-8b-instruct
- llama-3.3-70b-instruct
- llama-3.3-nemotron-super-49b-v1.5
- nemotron-3-nano
- nemotron-3-super-120b-a12b
NVIDIA-GB10
The following models have been verified on this GPU:
- gpt-oss-20b
- llama-3.1-8b-instruct
- llama-3.3-nemotron-super-49b-v1.5
- nemotron-3-nano
NVIDIA-GB200
The following models have been verified on this GPU:
- gpt-oss-120b
- gpt-oss-20b
- llama-3.1-70b-instruct
- llama-3.1-8b-instruct
- llama-3.3-70b-instruct
- llama-3.3-nemotron-super-49b-v1.5
- nemotron-3-nano
- nemotron-3-super-120b-a12b
NVIDIA-GH200-144G-HBM3e
The following models have been verified on this GPU:
- gpt-oss-120b
- gpt-oss-20b
- llama-3.1-70b-instruct
- llama-3.1-8b-instruct
- llama-3.3-70b-instruct
- llama-3.3-nemotron-super-49b-v1.5
- nemotron-3-nano
- nemotron-3-super-120b-a12b
NVIDIA-GH200-480GB
The following models have been verified on this GPU:
- gpt-oss-20b
- llama-3.1-70b-instruct
- llama-3.1-8b-instruct
- llama-3.3-70b-instruct
- llama-3.3-nemotron-super-49b-v1.5
- nemotron-3-nano
NVIDIA-H100-80GB-HBM3
The following models have been verified on this GPU:
- gpt-oss-120b
- gpt-oss-20b
- llama-3.1-70b-instruct
- llama-3.1-8b-instruct
- llama-3.3-70b-instruct
- llama-3.3-nemotron-super-49b-v1.5
- nemotron-3-nano
- nemotron-3-super-120b-a12b
- starcoder2-7b
NVIDIA-H100-NVL
The following models have been verified on this GPU:
- gpt-oss-120b
- llama-3.1-70b-instruct
- llama-3.1-8b-instruct
- llama-3.3-70b-instruct
- llama-3.3-nemotron-super-49b-v1.5
- nemotron-3-nano
- nemotron-3-super-120b-a12b
NVIDIA-H200
The following models have been verified on this GPU:
- gpt-oss-120b
- gpt-oss-20b
- llama-3.1-70b-instruct
- llama-3.1-8b-instruct
- llama-3.3-70b-instruct
- llama-3.3-nemotron-super-49b-v1.5
- nemotron-3-nano
- nemotron-3-super-120b-a12b
- starcoder2-7b
NVIDIA-H200-NVL
The following models have been verified on this GPU:
- gpt-oss-120b
- llama-3.1-70b-instruct
- llama-3.1-8b-instruct
- llama-3.3-70b-instruct
- llama-3.3-nemotron-super-49b-v1.5
- nemotron-3-nano
- nemotron-3-super-120b-a12b
NVIDIA-L40S
The following models have been verified on this GPU:
- gpt-oss-120b
- gpt-oss-20b
- llama-3.1-70b-instruct
- llama-3.1-8b-instruct
- llama-3.3-70b-instruct
- llama-3.3-nemotron-super-49b-v1.5
- nemotron-3-nano
- nemotron-3-super-120b-a12b
NVIDIA-RTX-PRO-4500-Blackwell-Server-Edition
The following models have been verified on this GPU:
- gpt-oss-20b
- llama-3.1-8b-instruct
- llama-3.3-nemotron-super-49b-v1.5
- nemotron-3-nano
- nemotron-3-super-120b-a12b
NVIDIA-RTX-PRO-6000-Blackwell-Server-Edition
The following models have been verified on this GPU:
- gpt-oss-120b
- gpt-oss-20b
- llama-3.1-70b-instruct
- llama-3.1-8b-instruct
- llama-3.3-70b-instruct
- llama-3.3-nemotron-super-49b-v1.5
- nemotron-3-nano
- nemotron-3-super-120b-a12b
1.x NIM LLM Models#
For more information on version 1.x NIMs, refer to the 1.15 version of the NIM LLM Supported Models page.
Show 1.x models
| Model (Hardware Requirements) | Organization/Model ID (Catalog Page) |
|---|---|