Overview#

The NVIDIA Synthetic Video Detector NIM is a GPU-accelerated microservice that analyzes video content and estimates the likelihood that a video is AI-generated (“synthetic”) rather than real. The service takes an input video and produces per-frame detection scores along with an aggregated video-level probability, so results can be inspected frame by frame or as a single score for the whole video.

The service employs a forensic-oriented detection strategy that focuses on identifying intrinsic artifacts introduced by generative video pipelines. Rather than relying on semantic content, the detector analyzes low-level statistical and frequency-domain traces that are characteristic of synthetic generation.

These signals are resilient to common video transformations, including compression and re-encoding, and generalize across different generative architectures.

Designed for high-throughput and low-latency deployment, the service supports use cases such as media authenticity verification, digital forensics, content moderation, and integrity monitoring in real-world environments.

Inputs and Outputs#

Input
- H.264-encoded MP4 video. The service decodes the video into RGB frames internally using GPU-accelerated video decoding.
Per-frame or per-segment output (optional)
- Synthetic likelihood scores associated with individual frames or temporal segments, which are useful for analysis and visualization.
Final aggregated output
- A single probability in [0, 1] that summarizes how likely the entire video is synthetic.
- Optional summary artifacts (for example, CSV or JSON) for offline analysis.
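
The relationship between the per-frame scores and the video-level probability can be illustrated with a minimal sketch. This section does not say how the service aggregates scores, so the plain mean, the `aggregate_scores` helper, the `threshold` parameter, and the JSON field names below are all assumptions made for illustration, not the service's actual method or output schema.

```python
import json
from statistics import fmean


def aggregate_scores(frame_scores, threshold=0.5):
    """Combine per-frame synthetic scores into one video-level summary.

    Assumption: a plain mean is used purely for illustration. The
    service's real aggregation method is not documented in this section.
    """
    if not frame_scores:
        raise ValueError("frame_scores must not be empty")
    for score in frame_scores:
        if not 0.0 <= score <= 1.0:
            raise ValueError(f"score out of range [0, 1]: {score}")

    video_probability = fmean(frame_scores)
    return {
        "num_frames": len(frame_scores),
        "video_probability": video_probability,
        # Hypothetical decision rule; the threshold is a caller's choice.
        "is_synthetic": video_probability >= threshold,
    }


# Hypothetical per-frame scores for a four-frame clip.
summary = aggregate_scores([0.5, 0.75, 1.0, 0.75])
print(json.dumps(summary, indent=2))  # video_probability is 0.75
```

A summary dictionary of this kind maps naturally onto the optional JSON artifact mentioned above. A real deployment might prefer a more robust statistic than the mean, such as a median or a trimmed mean, if a few outlier frames should not dominate the result.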

High-Level Service Architecture#

At a high level, the service consists of the following:

- gRPC front-end: A bi-directional streaming gRPC API that accepts video bytes and streams detection results back to the client.
- Video decoding layer: A GPU-accelerated decoding pipeline that converts H.264 video into normalized RGB frames suitable for model inference.
- Inference engine: A batched inference layer that runs frames through the DINOv2 + DINOv3 ensemble using TensorRT-optimized engines.
- Results and reporting: Video classification and per-frame analysis.
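
The data flow through these components can be sketched in plain Python. The sketch below is not the service's API: `iter_chunks`, `iter_batches`, and `ensemble_scores` are hypothetical helpers, the two "models" are stand-in callables rather than the DINOv2 and DINOv3 engines, and averaging the two models' scores is an assumed combination rule, since this section only says the models form an ensemble.

```python
from itertools import islice


def iter_chunks(data: bytes, chunk_size: int = 64 * 1024):
    """Yield successive byte chunks, as a streaming client might send them."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    for start in range(0, len(data), chunk_size):
        yield data[start:start + chunk_size]


def iter_batches(frames, batch_size: int):
    """Group an iterable of decoded frames into lists of up to batch_size."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    iterator = iter(frames)
    while True:
        batch = list(islice(iterator, batch_size))
        if not batch:
            return
        yield batch


def ensemble_scores(batch, models):
    """Average each frame's score across all models.

    Assumption: each model is a callable that maps a batch of frames to
    one score per frame, and the ensemble is a simple average.
    """
    per_model = [model(batch) for model in models]
    return [sum(scores) / len(scores) for scores in zip(*per_model)]


# Client side: split the encoded video into chunks for a streaming request.
chunks = list(iter_chunks(b"abcdefghij", chunk_size=4))
print(chunks)  # [b'abcd', b'efgh', b'ij']

# Server side: stand-in frames and stand-in models with constant scores.
frames = ["frame0", "frame1", "frame2", "frame3", "frame4"]
model_a = lambda batch: [0.25 for _ in batch]
model_b = lambda batch: [0.75 for _ in batch]

frame_scores = []
for batch in iter_batches(frames, batch_size=2):
    frame_scores.extend(ensemble_scores(batch, [model_a, model_b]))
print(frame_scores)  # [0.5, 0.5, 0.5, 0.5, 0.5]
```

In a real gRPC client, a generator such as `iter_chunks` would typically be wrapped so that each chunk is placed in the request message type defined by the service's proto file, which is not shown in this section.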

Try It Out#

Try the NVIDIA Synthetic Video Detector NIM at build.nvidia.com/nvidia/synthetic-video-detector.

To try the NVIDIA Synthetic Video Detector NIM API without hosting your own servers, use the Try API feature, which runs on the NVIDIA Cloud Function backend.