Introduction to the NVIDIA Public Safety AI Blueprint#

The NVIDIA Public Safety AI Blueprint provides a set of reference workflows tailored for physical security applications. These workflows help developers quickly build and deploy real‑world applications by extending the core Video Search and Summarization (VSS) capabilities with intelligent alert generation and validation pipelines.

Developers can deploy the entire blueprint or integrate selected components into their own applications.

Use Cases#

The Public Safety AI Blueprint currently supports:

Tailgating detection – identify unauthorized individuals following authorized personnel through secure access points

Future releases will expand support for additional physical security scenarios.
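To make the tailgating use case concrete, the sketch below flags a badge swipe when more than one person crosses the door shortly afterwards. This is an illustrative rule only: this section does not describe how the blueprint detects tailgating, and the names `find_tailgating`, `TailgatingCandidate`, and the `window_s` parameter are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass(frozen=True)
class TailgatingCandidate:
    """A badge swipe that was followed by more than one door crossing."""
    badge_time: float
    crossings: int


def find_tailgating(
    badge_times: Sequence[float],
    crossing_times: Sequence[float],
    window_s: float = 5.0,  # hypothetical window, not a blueprint setting
) -> List[TailgatingCandidate]:
    """Return badge swipes with more than one crossing within window_s seconds."""
    candidates = []
    for badge in sorted(badge_times):
        # Count crossings that fall inside [badge, badge + window_s].
        count = sum(1 for t in crossing_times if badge <= t <= badge + window_s)
        if count > 1:
            candidates.append(TailgatingCandidate(badge, count))
    return candidates
```

A rule this simple would produce false positives, for example when two authorized people badge in quick succession, which is one reason a separate verification step is useful.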

Key Components#

At a high level, the blueprint includes the following components:

Vision Language Models (VLM) – Cosmos Reason2 8B verifies alerts generated by the computer vision (CV) perception pipeline.
Perception and Video Analytics – filter and analyze video content to identify events relevant for VLM review.
Alert Verification Microservice – connect filtered content from the perception pipeline with the VLM for validation.
Video Storage Toolkit (VST) – use as a Video Management System (VMS) or interface with third‑party VMS solutions.
Agent – powered by Nemotron Nano 9B v2 for natural language interaction, enabling queries, report generation and incident lookup.
Scalable Deployment – run the solution across multiple cameras and locations.

Architecture Overview#

The following diagram illustrates the end-to-end architecture of the Public Safety AI Blueprint:

The architecture consists of two main pipelines:

Perception Pipeline (Left)

Cameras feed into VST/VMS, which streams video to the RT-CV (Real-Time Computer Vision) pipeline. The pipeline includes:

Behavior Analytics – Detects tailgating and other security events
Alert Verification – Validates alerts using Cosmos Reason2 8B VLM
Logstash – Routes data to storage services

All components communicate via a Message Bus, with alerts and CV metadata stored in Video Analytics Storage.
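The flow described above (candidate alert, verification, routing to storage) can be sketched with a small in-memory publish/subscribe bus. This is a toy model under stated assumptions, not the blueprint's implementation: the topic names, `MessageBus`, `build_pipeline`, and `stub_verify` are hypothetical, the stub stands in for a call to the VLM, and a plain list stands in for Logstash routing into Video Analytics Storage.

```python
from collections import defaultdict
from typing import Callable, Dict, List

Handler = Callable[[dict], None]


class MessageBus:
    """Minimal synchronous publish/subscribe bus (illustrative only)."""

    def __init__(self) -> None:
        self._subscribers: Dict[str, List[Handler]] = defaultdict(list)

    def subscribe(self, topic: str, handler: Handler) -> None:
        self._subscribers[topic].append(handler)

    def publish(self, topic: str, message: dict) -> None:
        # Deliver the message to every handler registered for this topic.
        for handler in self._subscribers[topic]:
            handler(message)


def build_pipeline(bus: MessageBus, verify: Callable[[dict], bool], storage: list) -> None:
    """Wire candidate alerts through verification and into storage."""

    def on_candidate(alert: dict) -> None:
        # Attach the verification verdict and republish on a second topic.
        bus.publish("alerts.verified", {**alert, "verified": verify(alert)})

    bus.subscribe("alerts.candidate", on_candidate)
    # Stand-in for the routing step: append every verified alert to storage.
    bus.subscribe("alerts.verified", storage.append)


def stub_verify(alert: dict) -> bool:
    """Hypothetical placeholder for the VLM verification call."""
    return alert.get("crossings", 0) > 1


bus = MessageBus()
storage: list = []
build_pipeline(bus, stub_verify, storage)
bus.publish("alerts.candidate", {"camera": "door-1", "crossings": 2})
```

Decoupling the stages through topics means the verifier can be swapped or scaled without changing the component that raises candidate alerts.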

Agent Pipeline (Right)

Users interact with the system through an AI Agent powered by Nemotron Nano 9B v2. The Agent connects to various tools via MCP (Model Context Protocol):

Video IO & Storage – Access to VST/VMS for video retrieval
Video Analytics Tool – Query incident data from Video Analytics Storage
Report Generation – Generate incident reports using Cosmos Reason2 8B
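The idea of an agent reaching its tools by name can be illustrated with a small registry. This sketch does not use an MCP SDK or reproduce the MCP wire protocol; it only shows the dispatch pattern. `ToolRegistry`, the tool name `query_incidents`, and the `INCIDENTS` data are all hypothetical.

```python
from typing import Any, Callable, Dict, List


class ToolRegistry:
    """Maps tool names to plain Python callables (illustrative only)."""

    def __init__(self) -> None:
        self._tools: Dict[str, Callable[..., Any]] = {}

    def register(self, name: str) -> Callable[[Callable[..., Any]], Callable[..., Any]]:
        def decorator(fn: Callable[..., Any]) -> Callable[..., Any]:
            self._tools[name] = fn
            return fn
        return decorator

    def names(self) -> List[str]:
        return sorted(self._tools)

    def call(self, name: str, **kwargs: Any) -> Any:
        if name not in self._tools:
            raise KeyError(f"unknown tool: {name}")
        return self._tools[name](**kwargs)


registry = ToolRegistry()

# Hypothetical in-memory stand-in for Video Analytics Storage.
INCIDENTS = [
    {"id": 1, "camera": "door-1", "type": "tailgating"},
    {"id": 2, "camera": "door-2", "type": "tailgating"},
]


@registry.register("query_incidents")
def query_incidents(camera: str) -> List[dict]:
    """Return stored incidents for one camera."""
    return [incident for incident in INCIDENTS if incident["camera"] == camera]
```

An agent that decides to look up incidents would then issue something like `registry.call("query_incidents", camera="door-1")`, with the tool name and arguments chosen by the language model.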

Inference Microservices provide the underlying AI capabilities:

nvidia-nemotron-nano-9bv2 – Agent reasoning and natural language understanding
cosmos-reason2-8B – Alert verification and report generation
RT‑DETR detector (Real‑Time Detection Transformer) – Object detection and tracking

The blueprint builds on the VSS foundation and adds specialized components for public safety applications:

Component	Purpose
Perception Pipeline	Detect and filter security events using CV models
Alert Verification Microservice	Route filtered content to VLM for validation
VLM Service	Verify alerts using Cosmos Reason2 8B
Agent	Natural language queries and report generation
VST / VMS Integration	Video management and third‑party VMS connectivity
Dashboard	Visualize alerts and incident metadata

In addition to these public safety features, the blueprint also includes all of the standard VSS capabilities for video ingestion, storage, and streaming.

Prerequisites#

Before deploying the Public Safety AI Blueprint, ensure you have:

Hardware – NVIDIA GPU with sufficient memory for VLM inference (refer to the Quickstart Guide for specific requirements)
Software – Docker and NVIDIA Container Toolkit installed
Licenses – Valid licenses for NVIDIA Metropolis components (see License Information)
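A quick way to check the software prerequisites is to confirm that the relevant command-line tools are on the `PATH`. The sketch below assumes the standard executable names `docker`, `nvidia-smi` (installed with the NVIDIA driver), and `nvidia-ctk` (the NVIDIA Container Toolkit CLI); the helper `missing_tools` is hypothetical. Finding a tool on the `PATH` does not confirm that it is configured correctly, that the GPU has enough memory, or that licenses are in place.

```python
import shutil
from typing import Callable, Iterable, List, Optional

# Assumed executable names; adjust if your installation differs.
REQUIRED_TOOLS = ("docker", "nvidia-smi", "nvidia-ctk")


def missing_tools(
    required: Iterable[str] = REQUIRED_TOOLS,
    which: Callable[[str], Optional[str]] = shutil.which,
) -> List[str]:
    """Return the required executables that cannot be found on PATH."""
    return [name for name in required if which(name) is None]


if __name__ == "__main__":
    missing = missing_tools()
    if missing:
        print("Missing prerequisites: " + ", ".join(missing))
    else:
        print("All checked tools were found on PATH.")
```

The `which` parameter is injectable so the check can be exercised without the tools installed.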

Next Steps#

Quickstart Guide – Get up and running with a basic deployment
Blueprint Deep Dive – Explore the architecture in detail
Agents – Learn how to configure and customize agents