Support Matrix for NVIDIA NIM for Object Detection#
This documentation describes the software and hardware that NVIDIA NIM for Object Detection supports.
Models#
NVIDIA NIM for Object Detection supports the following models:
| Model Name | Model ID | Publisher | Model Card |
|---|---|---|---|
| NeMo Retriever Page Elements v2 | nvidia/nemoretriever-page-elements-v2 | NVIDIA | |
| NeMo Retriever Graphic Elements v1 | nvidia/nemoretriever-graphic-elements-v1 | NVIDIA | |
| NeMo Retriever Table Structure v1 | nvidia/nemoretriever-table-structure-v1 | NVIDIA | |
| NeMo Retriever YOLOX Page Elements v1 | nvidia/nv-yolox-page-elements-v1 | NVIDIA | |
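As a quick orientation, the sketch below shows how one of these models is typically launched as a NIM container. The registry path, tag, and port are illustrative placeholders rather than confirmed values; check the model card on NGC for the exact image name.

```bash
# Log in to the NGC registry (requires an NGC API key).
echo "$NGC_API_KEY" | docker login nvcr.io --username '$oauthtoken' --password-stdin

# Start the NIM container. The image path and tag below are placeholders;
# substitute the values published on NGC for the model you want to run.
docker run -d --rm --gpus all \
  -e NGC_API_KEY=$NGC_API_KEY \
  -p 8000:8000 \
  nvcr.io/nim/nvidia/nemoretriever-page-elements-v2:latest
```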
Optimized vs. Non-Optimized Models#
The models listed in the following sections are optimized using TensorRT (TRT) and are available as pre-built, optimized engines on NGC. These optimized engines are GPU-specific and require a minimum amount of GPU memory, as specified in the Optimized configuration section for each model.
NVIDIA also provides generic model profiles that operate with any NVIDIA GPU (or set of GPUs) with sufficient memory capacity. These generic profiles are known as the non-optimized configuration. On systems with no compatible optimized profile, a generic profile is selected automatically. Optimized profiles are preferred over generic profiles when available, but you can deploy a generic profile on any system by following the steps in the Overriding Profile Selection section.
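As an illustrative sketch (not confirmed for this specific NIM), assuming the container includes the common NIM `list-model-profiles` utility and honors the `NIM_MODEL_PROFILE` environment variable, you could inspect and override the selected profile like this:

```bash
# List the profiles (optimized and generic) that the container can run,
# along with their compatibility with the local GPUs.
# The image path and tag are placeholders; use the values from NGC.
docker run --rm --gpus all \
  -e NGC_API_KEY=$NGC_API_KEY \
  nvcr.io/nim/nvidia/nemoretriever-page-elements-v2:latest \
  list-model-profiles

# To pin a specific profile (for example, a generic/non-optimized one),
# add -e NIM_MODEL_PROFILE=<profile-id-from-the-listing> to the docker run
# command that starts the container.
```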
Supported Hardware#
NeMo Retriever Page Elements v2#
Optimized configuration#
| GPU | GPU Memory (GB) | Precision |
|---|---|---|
| A100 SXM4 | 40 & 80 | FP16 |
| H100 HBM3 | 80 | FP16 |
| H100 NVL | 80 | FP16 |
| L40s | 48 | FP16 |
| A10G | 24 | FP16 |
| L4 | 24 | FP16 |
| B200 | 180 | FP16 |
Non-optimized configuration#
The GPU Memory and Disk Space values are in GB; Disk Space is for both the container and the model.
| GPUs | GPU Memory (GB) | Precision | Disk Space (GB) |
|---|---|---|---|
| Any single NVIDIA GPU that has sufficient memory, or multiple homogeneous NVIDIA GPUs that have sufficient memory in total. | 1 | FP16 | 12 |
If you run this model on RTX 40xx or later, you need a minimum of 8GB of VRAM.
NeMo Retriever Graphic Elements v1#
Optimized configuration#
| GPU | GPU Memory (GB) | Precision |
|---|---|---|
| A100 SXM4 | 40 & 80 | FP16 |
| H100 HBM3 | 80 | FP16 |
| H100 NVL | 80 | FP16 |
| L40s | 48 | FP16 |
| A10G | 24 | FP16 |
| L4 | 24 | FP16 |
| B200 | 180 | FP16 |
Non-optimized configuration#
The GPU Memory and Disk Space values are in GB; Disk Space is for both the container and the model.
| GPUs | GPU Memory (GB) | Precision | Disk Space (GB) |
|---|---|---|---|
| Any single NVIDIA GPU that has sufficient memory, or multiple homogeneous NVIDIA GPUs that have sufficient memory in total. | 1 | FP16 | 12 |
If you run this model on RTX 40xx or later, you need a minimum of 8GB of VRAM.
NeMo Retriever Table Structure v1#
Optimized configuration#
| GPU | GPU Memory (GB) | Precision |
|---|---|---|
| A100 SXM4 | 40 & 80 | FP16 |
| H100 HBM3 | 80 | FP16 |
| H100 NVL | 80 | FP16 |
| L40s | 48 | FP16 |
| A10G | 24 | FP16 |
| L4 | 24 | FP16 |
| B200 | 180 | FP16 |
Non-optimized configuration#
The GPU Memory and Disk Space values are in GB; Disk Space is for both the container and the model.
| GPUs | GPU Memory (GB) | Precision | Disk Space (GB) |
|---|---|---|---|
| Any single NVIDIA GPU that has sufficient memory, or multiple homogeneous NVIDIA GPUs that have sufficient memory in total. | 1 | FP16 | 12 |
If you run this model on RTX 40xx or later, you need a minimum of 8GB of VRAM.
NeMo Retriever YOLOX Page Elements v1#
Optimized configuration#
| GPU | GPU Memory (GB) | Precision |
|---|---|---|
| A100 SXM4 | 40 & 80 | FP16 |
| H100 HBM3 | 80 | FP16 |
| H100 NVL | 80 | FP16 |
| L40s | 48 | FP16 |
| A10G | 24 | FP16 |
| L4 | 24 | FP16 |
| GeForce RTX 4090 (Beta) | 24 | FP16 |
| NVIDIA RTX 6000 Ada Generation (Beta) | 48 | FP16 |
| GeForce RTX 5080 (Beta) | 16 | FP16 |
| GeForce RTX 5090 (Beta) | 32 | FP16 |
Non-optimized configuration#
The GPU Memory and Disk Space values are in GB; Disk Space is for both the container and the model.
| GPUs | GPU Memory (GB) | Precision | Disk Space (GB) |
|---|---|---|---|
| Any single NVIDIA GPU that has sufficient memory, or multiple homogeneous NVIDIA GPUs that have sufficient memory in total. | 1 | FP16 | 12 |
Software#
NVIDIA Driver#
Releases prior to 1.1.0 use Triton Inference Server 24.05, and release 1.1.0 uses Triton Inference Server 25.01. For the NVIDIA driver versions supported by each Triton release, refer to the corresponding Triton Release Notes.
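To confirm which driver version is installed on the host before you pull a container, you can query it with nvidia-smi, for example:

```bash
# Print the installed NVIDIA driver version for each GPU on the host.
nvidia-smi --query-gpu=driver_version --format=csv,noheader
```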
If issues arise when you start the NIM containers, ensure that the latest NVIDIA drivers are installed, and run the following commands to install the NVIDIA Container Toolkit and configure Docker to use the NVIDIA runtime.
```bash
# Configure the NVIDIA Container Toolkit apt repository and its signing key.
curl -fsSL https://nvidia.github.io/libnvidia-container/gpgkey | sudo gpg --dearmor -o /usr/share/keyrings/nvidia-container-toolkit-keyring.gpg \
  && curl -s -L https://nvidia.github.io/libnvidia-container/stable/deb/nvidia-container-toolkit.list | \
    sed 's#deb https://#deb [signed-by=/usr/share/keyrings/nvidia-container-toolkit-keyring.gpg] https://#g' | \
    sudo tee /etc/apt/sources.list.d/nvidia-container-toolkit.list

# Install the toolkit, configure Docker to use the NVIDIA runtime, and restart Docker.
sudo apt-get update
sudo apt-get install -y nvidia-container-toolkit
sudo nvidia-ctk runtime configure --runtime=docker
sudo systemctl restart docker
```
NVIDIA Container Toolkit#
Your Docker environment must support NVIDIA GPUs. Refer to the NVIDIA Container Toolkit documentation for more information.
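A quick way to confirm that Docker can access the GPUs through the toolkit is to run a minimal container with GPU access and call nvidia-smi, for example:

```bash
# Verify that containers can see the GPUs via the NVIDIA runtime.
sudo docker run --rm --runtime=nvidia --gpus all ubuntu nvidia-smi
```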