Support Matrix for NIM Day 0
This page lists the supported models and GPUs for NIM Day 0.
nemotron-content-safety-reasoning-4b-experimental
Latest supported NIM LLM version: 2.0.3
For information on getting started with nemotron-content-safety-reasoning-4b-experimental, refer to version 2.0.3 of the Day 0 documentation.
nemotron-3-ultra-550b-a55b
Latest supported NIM LLM version: 2.0.5-variant
The following table lists the supported configurations for nvidia/nemotron-3-ultra-550b-a55b:
The GPU Memory values are per GPU and in GB.
| GPU | GPU Memory (GB) | Precision | TP | PP | Number of GPUs | LoRA Profile |
|---|---|---|---|---|---|---|
| NVIDIA-H100-80GB-HBM3 | 80 | BF16 | 8 | 2 | 16 (multi-node) | Base and LoRA |
| NVIDIA-H100-80GB-HBM3 | 80 | NVFP4 | 8 | 1 | 8 | Base only |
| NVIDIA-H200 | 141 | BF16 | 8 | 1 | 8 | Base only |
| NVIDIA-H200 | 141 | NVFP4 | 4 | 1 | 4 | Base only |
| NVIDIA-H200 | 141 | NVFP4 | 8 | 1 | 8 | Base only |
| NVIDIA-B200 | 180 | BF16 | 8 | 1 | 8 | Base and LoRA |
| NVIDIA-B200 | 180 | NVFP4 | 2 | 1 | 2 | Base only |
| NVIDIA-B200 | 180 | NVFP4 | 4 | 1 | 4 | Base only |
| NVIDIA-B200 | 180 | NVFP4 | 8 | 1 | 8 | Base only |
| NVIDIA-B300-SXM6-AC | 288 | BF16 | 4 | 1 | 4 | Base only |
| NVIDIA-B300-SXM6-AC | 288 | BF16 | 8 | 1 | 8 | Base and LoRA |
| NVIDIA-B300-SXM6-AC | 288 | NVFP4 | 2 | 1 | 2 | Base only |
| NVIDIA-B300-SXM6-AC | 288 | NVFP4 | 4 | 1 | 4 | Base only |
| NVIDIA-B300-SXM6-AC | 288 | NVFP4 | 8 | 1 | 8 | Base only |
| NVIDIA-GB200 | 192 | BF16 | 4 | 2 | 8 (multi-node) | Base and LoRA |
| NVIDIA-GB200 | 192 | NVFP4 | 2 | 1 | 2 | Base only |
| NVIDIA-GB200 | 192 | NVFP4 | 4 | 1 | 4 | Base only |
| NVIDIA-GB300 | 288 | BF16 | 4 | 1 | 4 | Base only |
| NVIDIA-GB300 | 288 | NVFP4 | 2 | 1 | 2 | Base only |
| NVIDIA-GB300 | 288 | NVFP4 | 4 | 1 | 4 | Base only |
| Any | >= 270 | BF16 | 4 | 1 | 4 | Base only |
| Any | >= 139 | BF16 | 8 | 1 | 8 | Base and LoRA |
| Any | >= 172 | NVFP4 | 2 | 1 | 2 | Base only |
| Any | >= 90 | NVFP4 | 4 | 1 | 4 | Base only |
| Any | >= 49 | NVFP4 | 8 | 1 | 8 | Base only |
LoRA profiles are available only for rows marked Base and LoRA.
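As a rough illustration of the LoRA rule, the rows marked Base and LoRA can be captured in a small lookup. This is only a sketch for reasoning about the table: the names `LORA_ROWS` and `supports_lora` are hypothetical and are not part of NIM, and the tuples are copied by hand from the table.

```python
# Hypothetical helper, not a NIM API.
# Each tuple is (gpu, precision, tp, pp) for a row marked "Base and LoRA".
LORA_ROWS = {
    ("NVIDIA-H100-80GB-HBM3", "BF16", 8, 2),
    ("NVIDIA-B200", "BF16", 8, 1),
    ("NVIDIA-B300-SXM6-AC", "BF16", 8, 1),
    ("NVIDIA-GB200", "BF16", 4, 2),
    ("Any", "BF16", 8, 1),
}


def supports_lora(gpu: str, precision: str, tp: int, pp: int) -> bool:
    """Return True if the table marks this exact configuration as Base and LoRA."""
    return (gpu, precision, tp, pp) in LORA_ROWS
```

For example, `supports_lora("NVIDIA-B200", "BF16", 8, 1)` is true, while every NVFP4 row returns false because the table lists all NVFP4 configurations as Base only.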
Profile Topologies
GPU-specific profiles use the following tensor-parallel (TP) and pipeline-parallel (PP) values:
NVIDIA-H100-80GB-HBM3: BF16 TP=8, PP=2 across two nodes (16 GPUs total), which requires inter-node InfiniBand or RoCE networking; NVFP4 TP=8, PP=1 on a single node (8 GPUs).
NVIDIA-H200: BF16 TP=8, PP=1 on a single node (8 GPUs); NVFP4 TP=4 or TP=8, PP=1 on a single node.
NVIDIA-B200: BF16 TP=8, PP=1 on a single node (8 GPUs); NVFP4 TP=2, TP=4, or TP=8 on a single node.
NVIDIA-B300-SXM6-AC: BF16 TP=4 or TP=8 on a single node; NVFP4 TP=2, TP=4, or TP=8 on a single node.
NVIDIA-GB200: BF16 TP=4, PP=2 across two NVL trays of four GPUs each (8 GPUs total), which requires inter-node InfiniBand or RoCE networking; NVFP4 TP=2 or TP=4 on a single node.
NVIDIA-GB300: BF16 TP=4 on a single node; NVFP4 TP=2 or TP=4 on a single node.
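In every row of the table, the Number of GPUs column equals TP multiplied by PP, and a profile spans multiple nodes when that product exceeds the GPUs available in one node. A minimal sketch of that arithmetic follows. The function names are hypothetical, and the per-node GPU counts (8 for an H100 node, 4 for a GB200 NVL tray) are inferred from the descriptions above.

```python
# Hypothetical helpers for reasoning about the topologies; not a NIM API.

def gpus_required(tp: int, pp: int) -> int:
    """Total GPUs for a profile: tensor-parallel size times pipeline-parallel size."""
    return tp * pp


def needs_multi_node(tp: int, pp: int, gpus_per_node: int) -> bool:
    """True when the profile needs more GPUs than a single node provides."""
    return gpus_required(tp, pp) > gpus_per_node


# H100 BF16: TP=8, PP=2 on 8-GPU nodes needs 16 GPUs, so two nodes.
h100_bf16_multi_node = needs_multi_node(tp=8, pp=2, gpus_per_node=8)

# GB200 BF16: TP=4, PP=2 on 4-GPU trays needs 8 GPUs, so two trays.
gb200_bf16_multi_node = needs_multi_node(tp=4, pp=2, gpus_per_node=4)
```

Both example values evaluate to true, matching the two rows that the table labels as multi-node.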
The Any rows are GPU-memory-gated profiles that are available when each GPU satisfies the listed min_vram_per_device_gb gate:
BF16 TP=4, PP=1 requires at least 270 GB per GPU.
BF16 TP=8, PP=1 requires at least 139 GB per GPU.
NVFP4 TP=2, PP=1 requires at least 172 GB per GPU.
NVFP4 TP=4, PP=1 requires at least 90 GB per GPU.
NVFP4 TP=8, PP=1 requires at least 49 GB per GPU.
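The gates above can be read as a simple threshold check per GPU. The sketch below shows one way to list which memory-gated profiles a machine would satisfy. It is an illustration only and does not reproduce NIM's actual profile selection logic. `GatedProfile`, `ANY_PROFILES`, and `eligible_profiles` are hypothetical names, and the additional check that the machine has at least TP x PP GPUs is an assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GatedProfile:
    precision: str
    tp: int
    pp: int
    min_vram_gb: int  # per-GPU gate, copied from the "Any" rows
    lora: bool        # True for "Base and LoRA", False for "Base only"


# Values copied by hand from the "Any" rows of the support matrix.
ANY_PROFILES = [
    GatedProfile("BF16", 4, 1, 270, False),
    GatedProfile("BF16", 8, 1, 139, True),
    GatedProfile("NVFP4", 2, 1, 172, False),
    GatedProfile("NVFP4", 4, 1, 90, False),
    GatedProfile("NVFP4", 8, 1, 49, False),
]


def eligible_profiles(vram_per_gpu_gb: float, gpu_count: int) -> list[GatedProfile]:
    """Return gated profiles whose per-GPU VRAM gate and GPU count are both met."""
    return [
        p
        for p in ANY_PROFILES
        if vram_per_gpu_gb >= p.min_vram_gb and gpu_count >= p.tp * p.pp
    ]
```

Under these assumptions, a node with eight 141 GB GPUs satisfies the BF16 TP=8, NVFP4 TP=4, and NVFP4 TP=8 gates, but not NVFP4 TP=2 (172 GB) or BF16 TP=4 (270 GB).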
For more information about deploying NIM across multiple nodes, refer to Multi-Node Deployment.