Support Matrix

For production, Holoscan for Media is designed to run on NVIDIA-Certified Systems. For full hardware, networking, and cluster platform requirements, refer to the System Requirements for Holoscan for Media.

Models

| Model name | Model ID | Publisher |
|---|---|---|
| Active Speaker Detection | active-speaker-detection | NVIDIA |

Supported Video Resolutions

| Resolution | Width × Height | Notes |
|---|---|---|
| 4K UHD | 3840 × 2160 | Supported |
| 1080p | 1920 × 1080 | Default |
| 720p | 1280 × 720 | Supported |

Set the resolution via video.width and video.height. The upstream sender and downstream receiver must be configured with the same resolution.
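The resolution rule can be checked before deployment. The following is a minimal sketch, not part of the product: the function name `check_resolution` and the idea of passing each endpoint's resolution as a `(width, height)` tuple are assumptions made for illustration. Only the three supported resolutions come from the table above.

```python
# Supported (width, height) pairs, taken from the resolution table.
SUPPORTED_RESOLUTIONS = {
    "4K UHD": (3840, 2160),
    "1080p": (1920, 1080),
    "720p": (1280, 720),
}


def check_resolution(nim, sender, receiver):
    """Return a list of problems; an empty list means the settings agree.

    Each argument is a (width, height) tuple. `nim` stands for the values
    given to video.width and video.height; `sender` and `receiver` are the
    upstream and downstream endpoints (hypothetical inputs for this sketch).
    """
    problems = []
    if nim not in SUPPORTED_RESOLUTIONS.values():
        problems.append(f"unsupported resolution {nim[0]}x{nim[1]}")
    for name, res in (("sender", sender), ("receiver", receiver)):
        if res != nim:
            problems.append(
                f"{name} is {res[0]}x{res[1]}, expected {nim[0]}x{nim[1]}"
            )
    return problems


if __name__ == "__main__":
    # Default 1080p everywhere: no problems reported.
    print(check_resolution((1920, 1080), (1920, 1080), (1920, 1080)))
    # Receiver left at 720p: one mismatch reported.
    print(check_resolution((1920, 1080), (1920, 1080), (1280, 720)))
```

Returning a list of problems rather than raising on the first one lets a pre-deployment script report every mismatch in a single pass.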

Optimized GPU Configurations

| GPU | Precision |
|---|---|
| L40S | FP16 |
| NVIDIA RTX PRO 6000 Blackwell Server Edition | FP16 |

Pod Resource Requirements

The following resources are required per Active Speaker Detection NIM media function pod:

| Resource | Requirement |
|---|---|
| CPU | 12 cores |
| Memory | 8 GiB |
| Hugepages (2 Mi) | 8 GiB |
| NVIDIA GPU | 1 |

Ensure the target node has sufficient resources and that the GPU and hugepages are configured before deploying. Node selector configuration is covered in each chart’s installation page.
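For orientation, the requirements above can be written as a Kubernetes-style `resources` stanza. This is a sketch under assumptions: the function name `pod_resources` is hypothetical, and whether a given chart exposes its resources in this shape is not stated here, so check the chart's installation page. The resource names `cpu`, `memory`, `hugepages-2Mi`, and `nvidia.com/gpu` are the standard Kubernetes and NVIDIA device plugin names. Setting CPU and memory limits equal to requests is a choice made for this sketch; Kubernetes does require requests to equal limits for hugepages and extended resources such as GPUs.

```python
import json


def pod_resources():
    """Build a requests/limits mapping from the per-pod requirements table."""
    required = {
        "cpu": "12",             # 12 cores
        "memory": "8Gi",         # 8 GiB
        "hugepages-2Mi": "8Gi",  # 8 GiB backed by 2 MiB hugepages
        "nvidia.com/gpu": "1",   # one NVIDIA GPU
    }
    # Copy the mapping so requests and limits can be edited independently.
    return {"requests": dict(required), "limits": dict(required)}


if __name__ == "__main__":
    print(json.dumps(pod_resources(), indent=2))
```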

Software

NVIDIA Driver

| Prerequisite | Version | Reference |
|---|---|---|
| NVIDIA Graphics Driver for Linux | 571.21+ | NVIDIA Unix Drivers |
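A node's installed driver can be compared against the 571.21 minimum with a small helper. This is a sketch, not an official check: the names `parse_version` and `driver_ok` are hypothetical, and it assumes that driver versions compare component by component as integers. The version string can be obtained on the node with `nvidia-smi --query-gpu=driver_version --format=csv,noheader`.

```python
MIN_DRIVER = (571, 21)  # minimum driver version from the table above


def parse_version(text):
    """Split a dotted version string such as '575.57.08' into integers."""
    return tuple(int(part) for part in text.strip().split("."))


def driver_ok(version_text):
    """Return True if the driver version meets the 571.21 minimum.

    Assumes component-wise integer comparison, so '570.124.06' is older
    than '571.21' because 570 < 571.
    """
    return parse_version(version_text) >= MIN_DRIVER


if __name__ == "__main__":
    # Example inputs; on a real node, pass the output of:
    #   nvidia-smi --query-gpu=driver_version --format=csv,noheader
    for version in ("570.124.06", "571.21", "575.57.08"):
        print(version, driver_ok(version))
```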

Component Versions

The Active Speaker Detection NIM on Holoscan for Media uses the following software components:

| Component | Version |
|---|---|
| CUDA | 12.8.1 |
| cuDNN | 9.7.1.26 |
| TensorRT | 10.9.0.34 |
| Triton Inference Server | v2.56.0 |
| DeepStream | 8.0 |
| NVIDIA Media Gateway | 0.7.0 |