Support Matrix for NVIDIA NIM for Object Detection#
This documentation describes the software and hardware that NVIDIA NIM for Object Detection supports.
Models#
NVIDIA NIM for Object Detection supports the following models:
| Model Name | Model ID | Publisher | Model Card |
|---|---|---|---|
| NeMo Retriever Page Elements v2 | nvidia/nemoretriever-page-elements-v2 | NVIDIA | |
| NeMo Retriever Graphic Elements v1 | nvidia/nemoretriever-graphic-elements-v1 | NVIDIA | |
| NeMo Retriever Table Structure v1 | nvidia/nemoretriever-table-structure-v1 | NVIDIA | |
| NeMo Retriever YOLOX Page Elements v1 | nvidia/nv-yolox-page-elements-v1 | NVIDIA | |
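As a quick orientation, the sketch below shows how one of these models is typically launched as a NIM container. The registry path, tag, and port are illustrative placeholders rather than confirmed values; check the model card on NGC for the exact image name.

```bash
# Log in to the NGC registry (requires an NGC API key).
echo "$NGC_API_KEY" | docker login nvcr.io --username '$oauthtoken' --password-stdin

# Start the NIM container. The image path and tag below are placeholders;
# substitute the values published on NGC for the model you want to run.
docker run -d --rm --gpus all \
  -e NGC_API_KEY=$NGC_API_KEY \
  -p 8000:8000 \
  nvcr.io/nim/nvidia/nemoretriever-page-elements-v2:latest
```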
Optimized vs. Non-Optimized Models#
The models listed in the following sections are optimized using TensorRT (TRT) and are available as pre-built, optimized engines on NGC. These optimized engines are GPU-specific and require a minimum amount of GPU memory, as specified in the Optimized configuration section for each model.
NVIDIA also provides generic model profiles that operate with any NVIDIA GPU (or set of GPUs) with sufficient memory capacity. These generic profiles are known as the non-optimized configuration. On systems with no compatible optimized profile, a generic profile is selected automatically. Optimized profiles are preferred over generic profiles when available, but you can deploy a generic profile on any system by following the steps in the Overriding Profile Selection section.
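As an illustrative sketch (not confirmed for this specific NIM), assuming the container includes the common NIM `list-model-profiles` utility and honors the `NIM_MODEL_PROFILE` environment variable, you could inspect and override the selected profile like this:

```bash
# List the profiles (optimized and generic) that the container can run,
# along with their compatibility with the local GPUs.
# The image path and tag are placeholders; use the values from NGC.
docker run --rm --gpus all \
  -e NGC_API_KEY=$NGC_API_KEY \
  nvcr.io/nim/nvidia/nemoretriever-page-elements-v2:latest \
  list-model-profiles

# To pin a specific profile (for example, a generic/non-optimized one),
# add -e NIM_MODEL_PROFILE=<profile-id-from-the-listing> to the docker run
# command that starts the container.
```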
Supported Hardware#
NeMo Retriever Page Elements v2#
Optimized configuration#
| GPU | GPU Memory (GB) | Precision |
|---|---|---|
| A100 SXM4 | 40 & 80 | FP16 |
| H100 HBM3 | 80 | FP16 |
| H100 NVL | 80 | FP16 |
| L40s | 48 | FP16 |
| A10G | 24 | FP16 |
| L4 | 24 | FP16 |
| B200 | 180 | FP16 |
Non-optimized configuration#
The GPU Memory and Disk Space values are in GB; Disk Space is for both the container and the model.
| GPUs | GPU Memory (GB) | Precision | Disk Space (GB) |
|---|---|---|---|
| Any single NVIDIA GPU that has sufficient memory, or multiple homogeneous NVIDIA GPUs that have sufficient memory in total. | 1 | FP16 | 12 |
If you run this model on RTX 40xx or later, you need a minimum of 8GB of VRAM.
NeMo Retriever Graphic Elements v1#
Optimized configuration#
| GPU | GPU Memory (GB) | Precision |
|---|---|---|
| A100 SXM4 | 40 & 80 | FP16 |
| H100 HBM3 | 80 | FP16 |
| H100 NVL | 80 | FP16 |
| L40s | 48 | FP16 |
| A10G | 24 | FP16 |
| L4 | 24 | FP16 |
| B200 | 180 | FP16 |
Non-optimized configuration#
The GPU Memory and Disk Space values are in GB; Disk Space is for both the container and the model.
| GPUs | GPU Memory (GB) | Precision | Disk Space (GB) |
|---|---|---|---|
| Any single NVIDIA GPU that has sufficient memory, or multiple homogeneous NVIDIA GPUs that have sufficient memory in total. | 1 | FP16 | 12 |
If you run this model on RTX 40xx or later, you need a minimum of 8GB of VRAM.
NeMo Retriever Table Structure v1#
Optimized configuration#
| GPU | GPU Memory (GB) | Precision |
|---|---|---|
| A100 SXM4 | 40 & 80 | FP16 |
| H100 HBM3 | 80 | FP16 |
| H100 NVL | 80 | FP16 |
| L40s | 48 | FP16 |
| A10G | 24 | FP16 |
| L4 | 24 | FP16 |
| B200 | 180 | FP16 |
Non-optimized configuration#
The GPU Memory and Disk Space values are in GB; Disk Space is for both the container and the model.
| GPUs | GPU Memory (GB) | Precision | Disk Space (GB) |
|---|---|---|---|
| Any single NVIDIA GPU that has sufficient memory, or multiple homogeneous NVIDIA GPUs that have sufficient memory in total. | 1 | FP16 | 12 |
If you run this model on RTX 40xx or later, you need a minimum of 8GB of VRAM.
NeMo Retriever YOLOX Page Elements v1#
Optimized configuration#
| GPU | GPU Memory (GB) | Precision |
|---|---|---|
| A100 SXM4 | 40 & 80 | FP16 |
| H100 HBM3 | 80 | FP16 |
| H100 NVL | 80 | FP16 |
| L40s | 48 | FP16 |
| A10G | 24 | FP16 |
| L4 | 24 | FP16 |
| GeForce RTX 4090 (Beta) | 24 | FP16 |
| NVIDIA RTX 6000 Ada Generation (Beta) | 48 | FP16 |
| GeForce RTX 5080 (Beta) | 16 | FP16 |
| GeForce RTX 5090 (Beta) | 32 | FP16 |
Non-optimized configuration#
The GPU Memory and Disk Space values are in GB; Disk Space is for both the container and the model.
| GPUs | GPU Memory (GB) | Precision | Disk Space (GB) |
|---|---|---|---|
| Any single NVIDIA GPU that has sufficient memory, or multiple homogeneous NVIDIA GPUs that have sufficient memory in total. | 1 | FP16 | 12 |
Software#
NVIDIA Driver#
Releases prior to 1.1.0 use Triton Inference Server 24.05, and release 1.1.0 uses Triton Inference Server 25.01. For the NVIDIA driver versions supported by each Triton release, refer to the corresponding Triton Release Notes.
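To confirm which driver version is installed on the host before you pull a container, you can query it with nvidia-smi, for example:

```bash
# Print the installed NVIDIA driver version for each GPU on the host.
nvidia-smi --query-gpu=driver_version --format=csv,noheader
```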
If issues arise when you start the NIM containers, ensure that the latest NVIDIA drivers are installed, and run the following commands to install the NVIDIA Container Toolkit and configure Docker to use the NVIDIA runtime.
```bash
# Configure the NVIDIA Container Toolkit apt repository and its signing key.
curl -fsSL https://nvidia.github.io/libnvidia-container/gpgkey | sudo gpg --dearmor -o /usr/share/keyrings/nvidia-container-toolkit-keyring.gpg \
  && curl -s -L https://nvidia.github.io/libnvidia-container/stable/deb/nvidia-container-toolkit.list | \
    sed 's#deb https://#deb [signed-by=/usr/share/keyrings/nvidia-container-toolkit-keyring.gpg] https://#g' | \
    sudo tee /etc/apt/sources.list.d/nvidia-container-toolkit.list

# Install the toolkit, configure Docker to use the NVIDIA runtime, and restart Docker.
sudo apt-get update
sudo apt-get install -y nvidia-container-toolkit
sudo nvidia-ctk runtime configure --runtime=docker
sudo systemctl restart docker
```
NVIDIA Container Toolkit#
Your Docker environment must support NVIDIA GPUs. Refer to the NVIDIA Container Toolkit documentation for more information.
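A quick way to confirm that Docker can access the GPUs through the toolkit is to run a minimal container with GPU access and call nvidia-smi, for example:

```bash
# Verify that containers can see the GPUs via the NVIDIA runtime.
sudo docker run --rm --runtime=nvidia --gpus all ubuntu nvidia-smi
```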