Support Matrix#

Hardware#

Unless specified otherwise, NVIDIA NIM for vision language models (VLMs) should run on any NVIDIA GPU that has sufficient memory, although this is not guaranteed. It can also run on multiple homogeneous NVIDIA GPUs with sufficient aggregate memory and a CUDA compute capability of 7.0 or higher (8.0 or higher for bfloat16), again unless specified otherwise. For more information, refer to Supported Models.
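The compute capability and homogeneity conditions can be sketched as a small pre-flight check. This is an illustrative sketch, not part of NIM: the function names and the `(name, (major, minor))` input shape are assumptions, and the check does not cover the memory requirement.

```python
# Hypothetical pre-flight helpers; names and input shape are illustrative only.

def meets_compute_capability(capability, use_bfloat16=False):
    """Check a (major, minor) CUDA compute capability against the stated minimum.

    The minimum is 7.0, or 8.0 when bfloat16 is used.
    """
    required = (8, 0) if use_bfloat16 else (7, 0)
    return tuple(capability) >= required


def gpus_are_eligible(gpus, use_bfloat16=False):
    """Check a list of (name, (major, minor)) entries.

    Every GPU must be the same model (homogeneous) and meet the minimum
    compute capability. Memory is not checked here.
    """
    if not gpus:
        return False
    names = {name for name, _ in gpus}
    if len(names) != 1:
        return False  # mixed GPU models are not homogeneous
    return all(meets_compute_capability(cap, use_bfloat16) for _, cap in gpus)
```

On a host with PyTorch installed, `torch.cuda.get_device_name(i)` and `torch.cuda.get_device_capability(i)` are one way to obtain the name and the `(major, minor)` tuple for each device.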

NVIDIA NIM for VLMs does not support NVIDIA Virtual GPU (vGPU) environments.

For information on the supported operating systems, drivers, and software, refer to the About Get Started page.

Supported Models#

Kimi-K2.6#

Latest supported release tag: 2.0.4-variant

The following section lists the supported configurations for moonshotai/kimi-k2.6 (NGC catalog page).

Generic Configuration#

Kimi-K2.6 is a ~1T-parameter MoE VLM. It ships as INT4 only, with two tensor-parallel profiles selected automatically based on per-GPU memory.

The GPU Memory column is per-GPU HBM in GB; the Disk Space column is the NGC artifact size needed in the NIM cache (one-time download on first launch), in GB.

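| GPU | GPU Memory (GB) | Precision | # of GPUs | Disk Space (GB) |
|---|---|---|---|---|
| H200 | 141 | INT4 | 8 | 600 |
| H200 NVL | 141 | INT4 | 8 | 600 |
| B200 | 192 | INT4 | 8 | 600 |
| B300 SXM (288 GB) | 288 | INT4 | 4 or 8 | 600 |
| GB300 NVL | 288 | INT4 | 4 | 600 |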
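As a rough aid for checking a planned deployment against the table above, the rows can be transcribed into a lookup. This is a sketch under stated assumptions: the helper names are hypothetical, GB is treated as 10^9 bytes, and the lookup only mirrors the table; it does not reproduce how NIM selects a profile.

```python
import shutil

# Rows transcribed from the table above:
# GPU -> (per-GPU memory in GB, supported GPU counts).
SUPPORTED_CONFIGS = {
    "H200": (141, (8,)),
    "H200 NVL": (141, (8,)),
    "B200": (192, (8,)),
    "B300 SXM (288 GB)": (288, (4, 8)),
    "GB300 NVL": (288, (4,)),
}
DISK_SPACE_GB = 600  # Disk Space column, identical for every row


def is_supported(gpu_name, gpu_count):
    """Return True if the (GPU, count) pair appears in the table."""
    config = SUPPORTED_CONFIGS.get(gpu_name)
    return config is not None and gpu_count in config[1]


def aggregate_memory_gb(gpu_name, gpu_count):
    """Total memory across all GPUs, in GB, for a GPU listed in the table."""
    per_gpu_gb, _ = SUPPORTED_CONFIGS[gpu_name]
    return per_gpu_gb * gpu_count


def has_cache_space(cache_dir, required_gb=DISK_SPACE_GB):
    """Check free space at cache_dir (assumption: 1 GB = 10**9 bytes)."""
    free_gb = shutil.disk_usage(cache_dir).free / 1e9
    return free_gb >= required_gb
```

For example, `is_supported("B300 SXM (288 GB)", 4)` is true while `is_supported("H200", 4)` is false, and eight H200 GPUs give 8 × 141 = 1128 GB of aggregate memory.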