Support Matrix

This page lists the supported models, their deployment profiles, and the verified hardware SKUs for NIM LLM.

Supported Models and Profiles

Use the following sections to identify the supported deployment profiles for each model. Profile strings follow a naming convention described in Model Profiles and Selection.

Note

For supported hardware, refer to the Verified GPUs dropdown for each model or the GPU Compatibility section.
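
The profile strings in the tables below appear to share one shape: backend, precision, tensor-parallel size, pipeline-parallel size, and an optional LoRA suffix. The following is a minimal sketch of parsing that shape, assuming the pattern inferred from the strings on this page; the authoritative convention is the one described in Model Profiles and Selection. The names `Profile`, `parse_profile`, and `gpu_count` are hypothetical helpers, not part of NIM, and `gpu_count` assumes one GPU per tensor-parallel rank per pipeline stage.

```python
import re
from dataclasses import dataclass

# Pattern inferred from the profile strings listed on this page (an assumption,
# not the official grammar): <backend>-<precision>-tp<N>-pp<M>[-lora]
PROFILE_RE = re.compile(
    r"^(?P<backend>[a-z0-9]+)-(?P<precision>[a-z0-9]+)"
    r"-tp(?P<tp>\d+)-pp(?P<pp>\d+)(?P<lora>-lora)?$"
)


@dataclass(frozen=True)
class Profile:
    backend: str
    precision: str
    tp: int      # tensor-parallel size
    pp: int      # pipeline-parallel size
    lora: bool   # True when the string ends in "-lora"

    @property
    def gpu_count(self) -> int:
        # Assumption: one GPU per tensor-parallel rank per pipeline stage.
        return self.tp * self.pp


def parse_profile(name: str) -> Profile:
    """Split a profile string into its components, or raise ValueError."""
    m = PROFILE_RE.match(name)
    if m is None:
        raise ValueError(f"unrecognized profile string: {name!r}")
    return Profile(
        backend=m["backend"],
        precision=m["precision"],
        tp=int(m["tp"]),
        pp=int(m["pp"]),
        lora=m["lora"] is not None,
    )


p = parse_profile("vllm-mxfp4-tp4-pp1-lora")
print(p.precision, p.tp, p.pp, p.lora, p.gpu_count)  # mxfp4 4 1 True 4
```

A parser like this can help filter the tables programmatically, for example to keep only the profiles whose `gpu_count` fits the number of GPUs on a node.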

gpt-oss-120b

The following table lists the supported profile configurations for gpt-oss-120b:

| Precision | TP1 | TP2 | TP4 | TP8 |
| --- | --- | --- | --- | --- |
| MXFP4 | vllm-mxfp4-tp1-pp1 | vllm-mxfp4-tp2-pp1 | vllm-mxfp4-tp4-pp1 | vllm-mxfp4-tp8-pp1 |
| MXFP4 + LoRA | vllm-mxfp4-tp1-pp1-lora | vllm-mxfp4-tp2-pp1-lora | vllm-mxfp4-tp4-pp1-lora | vllm-mxfp4-tp8-pp1-lora |

Verified GPUs

This model has been verified on the following GPUs:

  • NVIDIA-A100-SXM4-40GB

  • NVIDIA-A100-SXM4-80GB

  • NVIDIA-B200

  • NVIDIA-B300-SXM6-AC

  • NVIDIA-GB200

  • NVIDIA-GH200-144G-HBM3e

  • NVIDIA-H100-80GB-HBM3

  • NVIDIA-H100-NVL

  • NVIDIA-H200

  • NVIDIA-H200-NVL

  • NVIDIA-L40S

  • NVIDIA-RTX-PRO-6000-Blackwell-Server-Edition

gpt-oss-20b

The following table lists the supported profile configurations for gpt-oss-20b:

| Precision | TP1 | TP2 | TP4 | TP8 |
| --- | --- | --- | --- | --- |
| MXFP4 | vllm-mxfp4-tp1-pp1 | vllm-mxfp4-tp2-pp1 | vllm-mxfp4-tp4-pp1 | vllm-mxfp4-tp8-pp1 |
| MXFP4 + LoRA | vllm-mxfp4-tp1-pp1-lora | vllm-mxfp4-tp2-pp1-lora | vllm-mxfp4-tp4-pp1-lora | vllm-mxfp4-tp8-pp1-lora |

Verified GPUs

This model has been verified on the following GPUs:

  • NVIDIA-A100-SXM4-40GB

  • NVIDIA-A100-SXM4-80GB

  • NVIDIA-A10G

  • NVIDIA-B200

  • NVIDIA-B300-SXM6-AC

  • NVIDIA-GB10

  • NVIDIA-GB200

  • NVIDIA-GH200-144G-HBM3e

  • NVIDIA-GH200-480GB

  • NVIDIA-H100-80GB-HBM3

  • NVIDIA-H200

  • NVIDIA-L40S

  • NVIDIA-RTX-PRO-4500-Blackwell-Server-Edition

  • NVIDIA-RTX-PRO-6000-Blackwell-Server-Edition

llama-3.1-70b-instruct

The following table lists the supported profile configurations for llama-3.1-70b-instruct:

| Precision | TP1 | TP2 | TP4 | TP8 |
| --- | --- | --- | --- | --- |
| BF16 | vllm-bf16-tp1-pp1 | vllm-bf16-tp2-pp1 | vllm-bf16-tp4-pp1 | vllm-bf16-tp8-pp1 |
| BF16 + LoRA | vllm-bf16-tp1-pp1-lora | vllm-bf16-tp2-pp1-lora | vllm-bf16-tp4-pp1-lora | vllm-bf16-tp8-pp1-lora |
| FP8 | vllm-fp8-tp1-pp1 | vllm-fp8-tp2-pp1 | vllm-fp8-tp4-pp1 | vllm-fp8-tp8-pp1 |
| FP8 + LoRA | vllm-fp8-tp1-pp1-lora | vllm-fp8-tp2-pp1-lora | vllm-fp8-tp4-pp1-lora | vllm-fp8-tp8-pp1-lora |
| NVFP4 | vllm-nvfp4-tp1-pp1 | vllm-nvfp4-tp2-pp1 | vllm-nvfp4-tp4-pp1 | vllm-nvfp4-tp8-pp1 |
| NVFP4 + LoRA | vllm-nvfp4-tp1-pp1-lora | vllm-nvfp4-tp2-pp1-lora | vllm-nvfp4-tp4-pp1-lora | vllm-nvfp4-tp8-pp1-lora |

Verified GPUs

This model has been verified on the following GPUs:

  • NVIDIA-A10G

  • NVIDIA-A100-SXM4-40GB

  • NVIDIA-A100-SXM4-80GB

  • NVIDIA-B200

  • NVIDIA-B300-SXM6-AC

  • NVIDIA-GB200

  • NVIDIA-GH200-144G-HBM3e

  • NVIDIA-GH200-480GB

  • NVIDIA-H100-80GB-HBM3

  • NVIDIA-H100-NVL

  • NVIDIA-H200

  • NVIDIA-H200-NVL

  • NVIDIA-L40S

  • NVIDIA-RTX-PRO-6000-Blackwell-Server-Edition

llama-3.1-8b-instruct

The following table lists the supported profile configurations for llama-3.1-8b-instruct:

| Precision | TP1 |
| --- | --- |
| BF16 | vllm-bf16-tp1-pp1 |
| BF16 + LoRA | vllm-bf16-tp1-pp1-lora |
| FP8 | vllm-fp8-tp1-pp1 |
| FP8 + LoRA | vllm-fp8-tp1-pp1-lora |
| NVFP4 | vllm-nvfp4-tp1-pp1 |
| NVFP4 + LoRA | vllm-nvfp4-tp1-pp1-lora |

Verified GPUs

This model has been verified on the following GPUs:

  • NVIDIA-A100-SXM4-40GB

  • NVIDIA-A100-SXM4-80GB

  • NVIDIA-B200

  • NVIDIA-B300-SXM6-AC

  • NVIDIA-GB10

  • NVIDIA-GB200

  • NVIDIA-GH200-144G-HBM3e

  • NVIDIA-GH200-480GB

  • NVIDIA-H100-80GB-HBM3

  • NVIDIA-H100-NVL

  • NVIDIA-H200

  • NVIDIA-H200-NVL

  • NVIDIA-L40S

  • NVIDIA-RTX-PRO-4500-Blackwell-Server-Edition

  • NVIDIA-RTX-PRO-6000-Blackwell-Server-Edition

llama-3.3-70b-instruct

The following table lists the supported profile configurations for llama-3.3-70b-instruct:

| Precision | TP1 | TP2 | TP4 | TP8 |
| --- | --- | --- | --- | --- |
| BF16 | vllm-bf16-tp1-pp1 | vllm-bf16-tp2-pp1 | vllm-bf16-tp4-pp1 | vllm-bf16-tp8-pp1 |
| BF16 + LoRA | vllm-bf16-tp1-pp1-lora | vllm-bf16-tp2-pp1-lora | vllm-bf16-tp4-pp1-lora | vllm-bf16-tp8-pp1-lora |
| FP8 | vllm-fp8-tp1-pp1 | vllm-fp8-tp2-pp1 | vllm-fp8-tp4-pp1 | vllm-fp8-tp8-pp1 |
| FP8 + LoRA | vllm-fp8-tp1-pp1-lora | vllm-fp8-tp2-pp1-lora | vllm-fp8-tp4-pp1-lora | vllm-fp8-tp8-pp1-lora |
| NVFP4 | vllm-nvfp4-tp1-pp1 | vllm-nvfp4-tp2-pp1 | vllm-nvfp4-tp4-pp1 | vllm-nvfp4-tp8-pp1 |
| NVFP4 + LoRA | | vllm-nvfp4-tp2-pp1-lora | vllm-nvfp4-tp4-pp1-lora | vllm-nvfp4-tp8-pp1-lora |

Verified GPUs

This model has been verified on the following GPUs:

  • NVIDIA-A10G

  • NVIDIA-A100-SXM4-40GB

  • NVIDIA-A100-SXM4-80GB

  • NVIDIA-B200

  • NVIDIA-B300-SXM6-AC

  • NVIDIA-GB200

  • NVIDIA-GH200-144G-HBM3e

  • NVIDIA-GH200-480GB

  • NVIDIA-H100-80GB-HBM3

  • NVIDIA-H100-NVL

  • NVIDIA-H200

  • NVIDIA-H200-NVL

  • NVIDIA-L40S

  • NVIDIA-RTX-PRO-6000-Blackwell-Server-Edition

llama-3.3-nemotron-super-49b-v1.5

The following table lists the supported profile configurations for llama-3.3-nemotron-super-49b-v1.5:

| Precision | TP1 | TP2 | TP4 | TP8 |
| --- | --- | --- | --- | --- |
| BF16 | vllm-bf16-tp1-pp1 | vllm-bf16-tp2-pp1 | vllm-bf16-tp4-pp1 | vllm-bf16-tp8-pp1 |
| BF16 + LoRA | vllm-bf16-tp1-pp1-lora | vllm-bf16-tp2-pp1-lora | vllm-bf16-tp4-pp1-lora | vllm-bf16-tp8-pp1-lora |
| FP8 | vllm-fp8-tp1-pp1 | vllm-fp8-tp2-pp1 | vllm-fp8-tp4-pp1 | vllm-fp8-tp8-pp1 |
| FP8 + LoRA | vllm-fp8-tp1-pp1-lora | vllm-fp8-tp2-pp1-lora | vllm-fp8-tp4-pp1-lora | vllm-fp8-tp8-pp1-lora |
| NVFP4 | vllm-nvfp4-tp1-pp1 | vllm-nvfp4-tp2-pp1 | vllm-nvfp4-tp4-pp1 | vllm-nvfp4-tp8-pp1 |
| NVFP4 + LoRA | vllm-nvfp4-tp1-pp1-lora | vllm-nvfp4-tp2-pp1-lora | vllm-nvfp4-tp4-pp1-lora | vllm-nvfp4-tp8-pp1-lora |

Verified GPUs

This model has been verified on the following GPUs:

  • NVIDIA-A100-SXM4-40GB

  • NVIDIA-A100-SXM4-80GB

  • NVIDIA-B200

  • NVIDIA-B300-SXM6-AC

  • NVIDIA-GB10

  • NVIDIA-GB200

  • NVIDIA-GH200-144G-HBM3e

  • NVIDIA-GH200-480GB

  • NVIDIA-H100-80GB-HBM3

  • NVIDIA-H100-NVL

  • NVIDIA-H200

  • NVIDIA-H200-NVL

  • NVIDIA-L40S

  • NVIDIA-RTX-PRO-4500-Blackwell-Server-Edition

  • NVIDIA-RTX-PRO-6000-Blackwell-Server-Edition

nemotron-3-nano

The following table lists the supported profile configurations for nemotron-3-nano:

| Precision | TP1 | TP2 | TP4 | TP8 |
| --- | --- | --- | --- | --- |
| BF16 | vllm-bf16-tp1-pp1 | vllm-bf16-tp2-pp1 | vllm-bf16-tp4-pp1 | vllm-bf16-tp8-pp1 |
| BF16 + LoRA | vllm-bf16-tp1-pp1-lora | vllm-bf16-tp2-pp1-lora | vllm-bf16-tp4-pp1-lora | vllm-bf16-tp8-pp1-lora |
| FP8 | vllm-fp8-tp1-pp1 | vllm-fp8-tp2-pp1 | vllm-fp8-tp4-pp1 | vllm-fp8-tp8-pp1 |
| FP8 + LoRA | vllm-fp8-tp1-pp1-lora | vllm-fp8-tp2-pp1-lora | vllm-fp8-tp4-pp1-lora | vllm-fp8-tp8-pp1-lora |
| NVFP4 | vllm-nvfp4-tp1-pp1 | vllm-nvfp4-tp2-pp1 | vllm-nvfp4-tp4-pp1 | vllm-nvfp4-tp8-pp1 |
| NVFP4 + LoRA | vllm-nvfp4-tp1-pp1-lora | vllm-nvfp4-tp2-pp1-lora | vllm-nvfp4-tp4-pp1-lora | vllm-nvfp4-tp8-pp1-lora |

Verified GPUs

This model has been verified on the following GPUs:

  • NVIDIA-A100-SXM4-40GB

  • NVIDIA-A100-SXM4-80GB

  • NVIDIA-B200

  • NVIDIA-B300-SXM6-AC

  • NVIDIA-GB10

  • NVIDIA-GB200

  • NVIDIA-GH200-144G-HBM3e

  • NVIDIA-GH200-480GB

  • NVIDIA-H100-80GB-HBM3

  • NVIDIA-H100-NVL

  • NVIDIA-H200

  • NVIDIA-H200-NVL

  • NVIDIA-L40S

  • NVIDIA-RTX-PRO-4500-Blackwell-Server-Edition

  • NVIDIA-RTX-PRO-6000-Blackwell-Server-Edition

starcoder2-7b

The following table lists the supported profile configurations for starcoder2-7b:

| Precision | TP1 | TP2 |
| --- | --- | --- |
| BF16 | vllm-bf16-tp1-pp1 | vllm-bf16-tp2-pp1 |

Verified GPUs

This model has been verified on the following GPUs:

  • NVIDIA-H100-80GB-HBM3

  • NVIDIA-H200

Model-Free NIM

The following models are tested and validated for model-free NIM:

  • gpt-oss-20b

  • apriel-nemotron

  • codestral

Other models are not explicitly validated, but the model-free NIM can be used with any model that the underlying backend (vLLM) version supports. Refer to Model-Free NIM for deployment details.

Verified GPUs

The model-free NIM has been verified on the following GPUs:

  • NVIDIA-A100-80GB-PCIe

  • NVIDIA-A100-PCIE-40GB

  • NVIDIA-A100-SXM4-40GB

  • NVIDIA-A100-SXM4-80GB

  • NVIDIA-B300-SXM6-AC

  • NVIDIA-GH200-480GB

  • NVIDIA-H100-80GB-HBM3

  • NVIDIA-H100-NVL

  • NVIDIA-H100-PCIe

  • NVIDIA-H200

  • NVIDIA-H200-NVL

  • NVIDIA-RTX-PRO-4500-Blackwell-Server-Edition

GPU Compatibility

Use the following dropdowns to determine which models are supported on a given GPU:

NVIDIA-A10G

The following models have been verified on this GPU:

NVIDIA-GB10

The following models have been verified on this GPU:

NVIDIA-RTX-PRO-4500-Blackwell-Server-Edition

The following models have been verified on this GPU:

NVIDIA-RTX-PRO-6000-Blackwell-Server-Edition
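
A GPU-to-model lookup of this kind can also be derived by inverting the per-model Verified GPUs lists earlier on this page. The sketch below does so for the four GPUs named above. The dictionary is transcribed from those lists and restricted to these four GPUs, it leaves out the model-free NIM list, and `models_by_gpu` is a hypothetical helper. The result reflects only the lists on this page and might not match the original dropdown contents.

```python
from collections import defaultdict

A10G = "NVIDIA-A10G"
GB10 = "NVIDIA-GB10"
RTX4500 = "NVIDIA-RTX-PRO-4500-Blackwell-Server-Edition"
RTX6000 = "NVIDIA-RTX-PRO-6000-Blackwell-Server-Edition"

# Transcribed from the per-model "Verified GPUs" lists on this page,
# restricted to the four GPUs above (other GPUs omitted for brevity).
VERIFIED_GPUS = {
    "gpt-oss-120b": [RTX6000],
    "gpt-oss-20b": [A10G, GB10, RTX4500, RTX6000],
    "llama-3.1-70b-instruct": [A10G, RTX6000],
    "llama-3.1-8b-instruct": [GB10, RTX4500, RTX6000],
    "llama-3.3-70b-instruct": [A10G, RTX6000],
    "llama-3.3-nemotron-super-49b-v1.5": [GB10, RTX4500, RTX6000],
    "nemotron-3-nano": [GB10, RTX4500, RTX6000],
    "starcoder2-7b": [],  # verified only on GPUs outside this subset
}


def models_by_gpu(verified):
    """Invert a model -> GPUs mapping into a GPU -> sorted models mapping."""
    index = defaultdict(list)
    for model, gpus in verified.items():
        for gpu in gpus:
            index[gpu].append(model)
    return {gpu: sorted(models) for gpu, models in sorted(index.items())}


for gpu, models in models_by_gpu(VERIFIED_GPUS).items():
    print(gpu)
    for model in models:
        print("  -", model)
```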

1.x NIM LLM Models

For more information on version 1.x NIMs, refer to the 1.15 version of the NIM LLM Supported Models page.
