Support Matrix for NIM Day 0

This page lists the supported models and GPUs for NIM Day 0.

nemotron-content-safety-reasoning-4b-experimental

Latest supported NIM LLM version: 2.0.3

For information on getting started with nemotron-content-safety-reasoning-4b-experimental, refer to version 2.0.3 of the Day 0 documentation.

nemotron-3-ultra-550b-a55b

Latest supported NIM LLM version: 2.0.5-variant

The following table lists the supported configurations for nvidia/nemotron-3-ultra-550b-a55b:

The GPU Memory values are per GPU and in GB.

GPU	GPU Memory	Precision	TP	PP	Number of GPUs	LoRA Profile
NVIDIA-H100-80GB-HBM3	80	BF16	8	2	16 (multi-node)	Base and LoRA
NVIDIA-H100-80GB-HBM3	80	NVFP4	8	1	8	Base only
NVIDIA-H200	141	BF16	8	1	8	Base only
NVIDIA-H200	141	NVFP4	4	1	4	Base only
NVIDIA-H200	141	NVFP4	8	1	8	Base only
NVIDIA-B200	180	BF16	8	1	8	Base and LoRA
NVIDIA-B200	180	NVFP4	2	1	2	Base only
NVIDIA-B200	180	NVFP4	4	1	4	Base only
NVIDIA-B200	180	NVFP4	8	1	8	Base only
NVIDIA-B300-SXM6-AC	288	BF16	4	1	4	Base only
NVIDIA-B300-SXM6-AC	288	BF16	8	1	8	Base and LoRA
NVIDIA-B300-SXM6-AC	288	NVFP4	2	1	2	Base only
NVIDIA-B300-SXM6-AC	288	NVFP4	4	1	4	Base only
NVIDIA-B300-SXM6-AC	288	NVFP4	8	1	8	Base only
NVIDIA-GB200	192	BF16	4	2	8 (multi-node)	Base and LoRA
NVIDIA-GB200	192	NVFP4	2	1	2	Base only
NVIDIA-GB200	192	NVFP4	4	1	4	Base only
NVIDIA-GB300	288	BF16	4	1	4	Base only
NVIDIA-GB300	288	NVFP4	2	1	2	Base only
NVIDIA-GB300	288	NVFP4	4	1	4	Base only
Any	>= 270	BF16	4	1	4	Base only
Any	>= 139	BF16	8	1	8	Base and LoRA
Any	>= 172	NVFP4	2	1	2	Base only
Any	>= 90	NVFP4	4	1	4	Base only
Any	>= 49	NVFP4	8	1	8	Base only

LoRA profiles are available only for rows marked Base and LoRA.

GPU-specific profiles use the following tensor-parallel (TP) and pipeline-parallel (PP) values:

NVIDIA-H100-80GB-HBM3: BF16 uses TP=8, PP=2 across two nodes (16 GPUs total) and requires inter-node InfiniBand or RoCE networking; NVFP4 uses TP=8, PP=1 on a single node (8 GPUs).
NVIDIA-H200: BF16 uses TP=8, PP=1 on a single node (8 GPUs); NVFP4 uses TP=4 or TP=8, PP=1 on a single node.
NVIDIA-B200: BF16 uses TP=8, PP=1 on a single node (8 GPUs); NVFP4 uses TP=2, TP=4, or TP=8, PP=1 on a single node.
NVIDIA-B300-SXM6-AC: BF16 uses TP=4 or TP=8, PP=1 on a single node; NVFP4 uses TP=2, TP=4, or TP=8, PP=1 on a single node.
NVIDIA-GB200: BF16 uses TP=4, PP=2 across two NVL trays of four GPUs each (8 GPUs total) and requires inter-node InfiniBand or RoCE networking; NVFP4 uses TP=2 or TP=4, PP=1 on a single node.
NVIDIA-GB300: BF16 uses TP=4, PP=1 on a single node; NVFP4 uses TP=2 or TP=4, PP=1 on a single node.
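
In every row of the table, the Number of GPUs column equals TP multiplied by PP. The following Python sketch checks that relationship against a few rows copied from the table. The `ROWS` list and the `total_gpus` helper are illustrative names only and are not part of NIM.

```python
# Rows copied from the support table: (GPU, precision, TP, PP, listed GPU count).
ROWS = [
    ("NVIDIA-H100-80GB-HBM3", "BF16", 8, 2, 16),   # multi-node
    ("NVIDIA-H100-80GB-HBM3", "NVFP4", 8, 1, 8),
    ("NVIDIA-B200", "NVFP4", 2, 1, 2),
    ("NVIDIA-GB200", "BF16", 4, 2, 8),             # multi-node
]


def total_gpus(tp: int, pp: int) -> int:
    """Return the GPU count for a profile: one TP group per pipeline stage."""
    return tp * pp


for gpu, precision, tp, pp, listed in ROWS:
    # The listed GPU count should match TP x PP for every row.
    assert total_gpus(tp, pp) == listed
    print(f"{gpu} {precision}: TP={tp} x PP={pp} = {total_gpus(tp, pp)} GPUs")
```

Profiles with PP=2 in the table are the ones marked multi-node, so a quick TP x PP calculation also indicates whether a single 8-GPU node is likely to be enough.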

The Any rows are GPU-memory-gated profiles. They are available on any GPU model when each GPU satisfies the min_vram_per_device_gb gate listed in the GPU Memory column.
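
As a rough illustration of how the memory gate works, the following Python sketch filters the Any rows by per-GPU memory. The thresholds are copied from the table; the `GatedProfile` class, the `eligible_profiles` function, and the additional check that the host has at least TP x PP GPUs are assumptions made for this example and do not reflect the actual NIM profile selection logic.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GatedProfile:
    precision: str
    tp: int
    pp: int
    min_vram_per_device_gb: int  # gate from the GPU Memory column
    lora: bool                   # True for "Base and LoRA" rows


# The five "Any" rows from the support table.
ANY_PROFILES = [
    GatedProfile("BF16", 4, 1, 270, False),
    GatedProfile("BF16", 8, 1, 139, True),
    GatedProfile("NVFP4", 2, 1, 172, False),
    GatedProfile("NVFP4", 4, 1, 90, False),
    GatedProfile("NVFP4", 8, 1, 49, False),
]


def eligible_profiles(vram_per_gpu_gb: float, num_gpus: int) -> list[GatedProfile]:
    """Return the Any profiles whose memory gate and GPU count are both met.

    The GPU-count check (num_gpus >= TP * PP) is an assumption of this sketch.
    """
    return [
        p
        for p in ANY_PROFILES
        if vram_per_gpu_gb >= p.min_vram_per_device_gb and num_gpus >= p.tp * p.pp
    ]


# Example: a hypothetical host with eight GPUs of 96 GB each.
for profile in eligible_profiles(vram_per_gpu_gb=96, num_gpus=8):
    print(profile)
```

With these inputs, only the NVFP4 TP=4 and NVFP4 TP=8 rows pass, because 96 GB is below the 139 GB gate of the BF16 TP=8 row.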

For more information about deploying NIM across multiple nodes, refer to Multi-Node Deployment.