Support Matrix#

Models#

| Model Name | Model ID | Publisher |
| --- | --- | --- |
| Active Speaker Detection | active-speaker-detection | NVIDIA |

Optimized Configurations#

| GPU | Precision |
| --- | --- |
| T4 | FP16 |
| A2, A10, A16, A40 | FP16 |
| L4, L40, L40S | FP16 |
| B40, NVIDIA RTX PRO 6000 Blackwell Server Edition | FP16 |

Other Architectures (Consumer RTX GPUs)#

| Consumer GPU | Precision |
| --- | --- |
| RTX 4090 | FP16 |
| RTX 5090, 5080 | FP16 |

The Active Speaker Detection NIM is compatible with professional and consumer GPUs that have Tensor Cores and are based on one of the following NVIDIA architectures: Blackwell, Ada, Ampere, or Turing. RTX-based GPUs are also supported.
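As a rough way to check a machine against the architectures listed above, the following sketch maps the compute capability reported by `nvidia-smi` to an architecture name. The mapping table and the helper names (`architecture_for`, `parse_gpu_rows`, `query_gpus`) are assumptions made for illustration and are not part of the NIM. The `compute_cap` query field also requires a reasonably recent driver.

```python
import subprocess

# Architectures named in the compatibility statement above.
SUPPORTED_ARCHITECTURES = {"Turing", "Ampere", "Ada", "Blackwell"}


def architecture_for(compute_cap: str) -> str:
    """Map a compute capability such as "8.9" to an architecture name.

    Assumption: the mapping follows NVIDIA's published compute capabilities
    (7.5 Turing; 8.0/8.6/8.7 Ampere; 8.9 Ada; 10.x and 12.x Blackwell).
    Anything else is reported as "Other".
    """
    major, minor = (int(part) for part in compute_cap.strip().split("."))
    if (major, minor) == (7, 5):
        return "Turing"
    if major == 8 and minor in (0, 6, 7):
        return "Ampere"
    if (major, minor) == (8, 9):
        return "Ada"
    if major in (10, 12):
        return "Blackwell"
    return "Other"


def parse_gpu_rows(csv_text: str) -> list[tuple[str, str, str]]:
    """Parse "name, compute_cap" lines into (name, compute_cap, architecture)."""
    rows = []
    for line in csv_text.strip().splitlines():
        # Split on the last comma so a comma in the GPU name cannot break parsing.
        name, cap = (field.strip() for field in line.rsplit(",", 1))
        rows.append((name, cap, architecture_for(cap)))
    return rows


def query_gpus() -> list[tuple[str, str, str]]:
    """Ask nvidia-smi for each GPU's name and compute capability."""
    output = subprocess.run(
        ["nvidia-smi", "--query-gpu=name,compute_cap", "--format=csv,noheader"],
        capture_output=True, text=True, check=True,
    ).stdout
    return parse_gpu_rows(output)
```

Calling `query_gpus()` on a host with the NVIDIA driver installed returns one tuple per GPU. A GPU whose architecture is in `SUPPORTED_ARCHITECTURES` matches the statement above, but this check says nothing about NVDEC availability.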

The Active Speaker Detection NIM uses NVDEC hardware acceleration for video decoding.

GPUs that lack NVDEC hardware might not be supported.

Some GPUs support only a limited number of concurrent NVDEC sessions, which means the NIM can process only that same number of concurrent inputs on those GPUs.

For details, refer to the Video Encode and Decode Support Matrix.
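On a GPU with a limited number of concurrent NVDEC sessions, a client may want to cap how many inputs it submits at once. The sketch below shows one way to do that with a fixed-size thread pool. The `process_inputs` helper, the `worker` callable, and the `max_sessions` value are hypothetical: the actual session limit has to be taken from the Video Encode and Decode Support Matrix for the GPU in question, and the worker stands in for whatever call sends one input to the NIM.

```python
from concurrent.futures import ThreadPoolExecutor


def process_inputs(inputs, worker, max_sessions):
    """Run worker(input) for every input, with at most max_sessions in flight.

    Hypothetical helper: max_sessions is the caller's estimate of how many
    concurrent NVDEC sessions the GPU allows.
    """
    if max_sessions < 1:
        raise ValueError("max_sessions must be at least 1")
    # The pool size bounds concurrency; map() returns results in input order.
    with ThreadPoolExecutor(max_workers=max_sessions) as pool:
        return list(pool.map(worker, inputs))
```

A thread pool is used here because the work is assumed to be I/O-bound (waiting on the service). Inputs beyond the limit wait in the queue until a slot is free.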

Software#

NVIDIA Driver and Prerequisites#

NVIDIA driver requirements and other prerequisites for the Active Speaker Detection NIM:

| Prerequisite | Version | Download and Installation Steps |
| --- | --- | --- |
| NVIDIA Graphics Drivers for Linux | 571.21+ | NVIDIA Unix Drivers |
| Docker | latest | Ubuntu, CentOS, Debian: Install Docker Engine; Rocky Linux: Docker - Install Engine |
| NVIDIA Container Toolkit | latest | Installation and configuration instructions |
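The driver requirement in the table can be checked with a short script. The following is a minimal sketch that compares the installed driver version with the 571.21 minimum. The helper names (`parse_version`, `meets_minimum`, `installed_driver_version`) are illustrative, and it assumes that driver versions are dot-separated integers that compare field by field.

```python
import subprocess

MINIMUM_DRIVER = "571.21"  # minimum driver version from the prerequisites table


def parse_version(text: str) -> tuple[int, ...]:
    """Turn a version such as "575.57.08" into (575, 57, 8)."""
    return tuple(int(part) for part in text.strip().split("."))


def meets_minimum(installed: str, minimum: str = MINIMUM_DRIVER) -> bool:
    """Return True if the installed version is at least the minimum."""
    a, b = parse_version(installed), parse_version(minimum)
    # Pad the shorter tuple with zeros so "571.21" compares equal to "571.21.0".
    width = max(len(a), len(b))
    a += (0,) * (width - len(a))
    b += (0,) * (width - len(b))
    return a >= b


def installed_driver_version() -> str:
    """Read the driver version of the first GPU from nvidia-smi."""
    output = subprocess.run(
        ["nvidia-smi", "--query-gpu=driver_version", "--format=csv,noheader"],
        capture_output=True, text=True, check=True,
    ).stdout
    return output.strip().splitlines()[0].strip()
```

For example, `meets_minimum(installed_driver_version())` returns `False` on a host that still runs a 570-series driver.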

The Active Speaker Detection NIM uses the following NVIDIA software platforms:

| Component | Version |
| --- | --- |
| CUDA | 12.8.1 |
| cuDNN | 9.7.1.26 |
| TensorRT (TRT) | 10.9.0.34 |
| Triton Inference Server | v2.56.0 |
| DeepStream | 8.0 |