Alert Verification Microservice#

The Alert Verification Microservice is a modular, configuration-driven service for ingesting alerts generated by video analytics pipelines, verifying them using Vision Language Models (VLMs), and publishing the results to downstream systems.

The service supports alerts in nvschema format (nv.Incident and nv.Behavior), ingested via Kafka or an HTTP API. Alerts are typically generated by video analytics pipelines based on events detected within a camera's field of view. Upon receiving an alert, the Alert Verification Microservice retrieves the video URL for the alert time window from VIOS based on the sensor ID and timestamps, then passes this URL to a VLM-based verification backend (such as NVIDIA Cosmos Reason1 NIM) through an OpenAI-compatible API. The VLM downloads the video and analyzes it against configurable prompts.

Primary Workflow

The Alert Verification Microservice follows a structured 4-step workflow:

  1. Alert Ingestion: Alerts are received via HTTP API or Kafka topics. The system validates the message schema, applies normalization, and queues alerts for processing.

  2. Video URL Retrieval: The service extracts sensor ID, start time, and end time from the alert, then retrieves the corresponding video URL from VIOS using REST API.

  3. VLM Analysis: The video URL and a dynamically constructed prompt (based on alert type and context) are sent to the VLM backend. The VLM downloads the video, analyzes it, and returns a verdict with reasoning.

  4. Result Publishing: Verified alerts are persisted to Elasticsearch and optionally published to Kafka (enabled via configuration). Results include the original alert fields plus verification metadata.
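
The four steps above can be sketched as a single processing function. This is a minimal illustration, not the service's actual internals: the client objects (`vios_client`, `vlm_client`, `publish`) and the `build_prompt` helper are hypothetical stand-ins for the real components.

```python
def build_prompt(category, alert):
    """Hypothetical prompt builder: select a template by alert category."""
    templates = {"collision": "Did a collision occur at {place}?"}
    place = alert.get("place", {}).get("name", "unknown")
    return templates.get(category, "Describe the event.").format(place=place)

def verify_alert(alert, vios_client, vlm_client, publish):
    """Sketch of the 4-step pipeline: ingest -> video URL -> VLM -> publish."""
    # Step 1: the alert has already been schema-validated and normalized.
    sensor_id, start, end = alert["sensorId"], alert["timestamp"], alert["end"]

    # Step 2: resolve the video URL for the alert time window via VIOS.
    video_url = vios_client.get_video_url(sensor_id, start, end)

    # Step 3: render the alert-type-specific prompt and query the VLM backend.
    prompt = build_prompt(alert["category"], alert)
    verdict, reasoning = vlm_client.analyze(video_url, prompt)

    # Step 4: attach verification metadata and publish to configured sinks.
    alert.setdefault("info", {}).update({"verdict": verdict, "reasoning": reasoning})
    publish(alert)
    return alert
```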

Key Features#

  • Event-Driven Processing: Consumes alerts from Kafka with configurable consumer groups and offset management.

  • HTTP API Support: RESTful endpoints for direct alert submission (POST /api/v1/alerts for behaviors, POST /api/v1/incidents for incidents).

  • nvschema Compatibility: Native support for nv.Incident and nv.Behavior message formats in both Protobuf and JSON.

  • Automatic Video Resolution: Retrieves video segments from VIOS based on alert timestamps and sensor information.

  • Configurable Prompts: Alert-type-specific prompts with template placeholders for dynamic content injection from alert payloads.

  • Configurable VLM Parameters: Tune video preprocessing and inference settings to match deployment requirements and VLM backend capabilities.

  • Structured VLM Responses: Standardized response format with verdict (confirmed/rejected/unverified), reasoning trace, and HTTP-like status codes.

  • Flexible Output Options: Persist results to Elasticsearch for analytics, with optional Kafka publishing.

  • Custom Category Names: Configure user-friendly display names for alert categories in output (e.g., "collision" displays as "Vehicle Collision" in Elasticsearch and UI).

  • Alert Enrichment: Optionally generate detailed event descriptions after successful verification, providing additional context such as objects involved, sequence of events, and environmental factors.

Setup#

The Alert Verification Microservice supports the following deployment options:

  • Docker Compose deployment

  • Kubernetes deployment

Note

For detailed deployment instructions, refer to the Docker Compose Deployment and Kubernetes Deployment sections.

Prerequisites

Before deploying the Alert Verification Microservice, ensure the following requirements are met:

  • A valid NGC API key is required for accessing NVIDIA services.

  • A VLM backend (e.g., NVIDIA Cosmos Reason1 NIM) must be deployed and accessible.

  • VIOS service must be available for video retrieval.

  • Kafka is required if using event-driven ingestion.

  • Elasticsearch (optional) for result persistence and analytics.

Configuring Prompts#

Prompts control how the VLM analyzes video content for each alert type. They are configured via a JSON file (alert_type_config.json) that is loaded at startup and stored in RedisJSON for runtime access.

Note

Prompt configuration changes require a container restart in the current release. API-based prompt management is not supported.

Prompt Structure#

Each alert type requires a configuration with the following components:

  • user (mandatory): The primary instruction describing how to analyze the video and what decision to make.

  • system (optional): Contextual setup that shapes the VLM’s behavior, tone, and domain knowledge.

  • enrichment (optional): A prompt to generate detailed event descriptions after successful verification. When defined, a second VLM call provides additional context such as objects involved, sequence of events, and environmental factors.

  • output_category (optional): A user-friendly display name for the alert category in output. When specified, this name appears in Elasticsearch and downstream systems instead of the internal alert_type value (e.g., "collision" displays as "Vehicle Collision").

Prompts are stored in RedisJSON at startup and selected based on the alert type during processing.

Prompt Templating#

Prompts support dynamic placeholders that are substituted with values from the alert payload at runtime:

  • Placeholders use {field.path} syntax with dot notation for nested fields.

  • Examples: {info.primaryObjectId}, {place.name}, {objectIds}

  • Missing fields render as <missing:field.path> for debugging.

  • Array values are automatically joined as comma-separated strings.
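
The substitution rules above can be illustrated with a short sketch. This approximates the described behavior and is not the service's actual template engine:

```python
import re

def render_prompt(template, payload):
    """Substitute {field.path} placeholders with values from the alert payload."""
    def resolve(match):
        path = match.group(1)
        value = payload
        for key in path.split("."):           # dot notation for nested fields
            if isinstance(value, dict) and key in value:
                value = value[key]
            else:
                return f"<missing:{path}>"    # missing fields render for debugging
        if isinstance(value, list):           # arrays join as comma-separated strings
            return ", ".join(str(v) for v in value)
        return str(value)
    return re.sub(r"\{([\w.]+)\}", resolve, template)
```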

Alert Type Matching#

The category field in the incoming alert is matched against the alert_type in the prompt configuration. When a match is found, the corresponding prompts are selected and rendered with alert context.

VLM Response Format#

The VLM must respond using the following format for proper parsing:

<think>
reasoning trace here
</think>
<answer>
verdict (A/B or true/false)
</answer>

The answer section should contain a clear verdict that the system can map to confirmed/rejected status.
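
A parser for this format might look like the following sketch. The exact mapping from answer text to verdict is an assumption for illustration (the real service may use stricter matching):

```python
import re

def parse_vlm_response(text):
    """Extract (reasoning, verdict) from a <think>/<answer> formatted response."""
    think = re.search(r"<think>\s*(.*?)\s*</think>", text, re.DOTALL)
    answer = re.search(r"<answer>\s*(.*?)\s*</answer>", text, re.DOTALL)
    if not answer:
        return "", "unverified"               # no answer section: verification incomplete
    raw = answer.group(1).strip().lower()
    # Approximate mapping: A/true-style answers confirm, B/false-style reject.
    if raw.startswith(("a", "true")):
        verdict = "confirmed"
    elif raw.startswith(("b", "false")):
        verdict = "rejected"
    else:
        verdict = "unverified"
    reasoning = think.group(1).strip() if think else ""
    return reasoning, verdict
```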

Example Configuration#

Below is an example prompt configuration from alert_type_config.json:

{
  "version": "1.0",
  "alerts": [
    {
      "alert_type": "collision",
      "output_category": "Vehicle Collision",
      "prompts": {
        "system": "You are an expert AI assistant for video analysis. Your task is to determine whether a surveillance video depicts a **collision event** or **no collision**, based on the definitions below...",
        "user": "Based on the video, which category best describes what occurred at {place.name}:\n(A) Collision (physical contact or impact detected)\n(B) No collision (no contact or impact)...",
        "enrichment": "Provide a detailed description of this collision event. Include: 1) Objects/vehicles involved and their identifiers 2) Sequence of events leading to the collision 3) Estimated speeds and trajectories 4) Visible damage or safety concerns 5) Environmental factors such as lighting, weather, or obstacles"
      }
    },
    {
      "alert_type": "Stop Anomaly Module",
      "output_category": "Abnormal Vehicle Stop",
      "prompts": {
        "system": "You are an expert AI assistant for video analysis. Your task is to determine whether a surveillance video depicts **normal stopping behavior** or a **stop anomaly**, based on the strict definitions below...",
        "user": "Based on the video, which category best describes the observed behavior at {place.name} (if available):\n(A) Stop anomaly (unexpected or unsafe stop)\n(B) Normal stop (safe and expected behavior)..."
      }
    }
  ]
}

In this example:

  • The collision alert type uses output_category to display as "Vehicle Collision" in output, and includes an enrichment prompt for detailed event descriptions.

  • The Stop Anomaly Module uses output_category but no enrichment prompt (enrichment is optional per alert type).

Alert Messages and Schema Format#

The Alert Verification Microservice supports nvschema message formats for both alerts and incidents:

  • nv.Incident: For incident-type events requiring verification.

  • nv.Behavior: For behavior/alert-type events requiring verification.

Both Protobuf and JSON formats are supported for message serialization.

Note

For the complete schema definitions of nv.Incident and nv.Behavior messages, refer to the Protobuf Schema documentation.

Alert Ingestion Endpoints#

HTTP API

  • POST /api/v1/alerts: Submit alerts in nv.Behavior format (JSON or Protobuf).

  • POST /api/v1/incidents: Submit incidents in nv.Incident format (JSON or Protobuf).

Responses: 202 Accepted (queued), 422 (validation error), 415 (unsupported media type), 500 (internal error).
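
Submitting a behavior alert over HTTP can be done with the standard library alone; a minimal sketch (the base URL is a placeholder for your deployment):

```python
import json
import urllib.request

def build_alert_request(base_url, alert):
    """Build a POST request for the /api/v1/alerts endpoint (JSON body)."""
    return urllib.request.Request(
        f"{base_url}/api/v1/alerts",
        data=json.dumps(alert).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )

# To send: urllib.request.urlopen(build_alert_request(url, alert))
# Expect 202 Accepted when the alert is queued, 422 on schema validation errors.
```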

Kafka Topics (configurable)

  • Input: mdx-incidents (nv.Incident), mdx-alerts (nv.Behavior)

  • Output: mdx-vlm-incidents, mdx-vlm-alerts (when Kafka sink enabled)

Key Fields#

The following table describes key fields in alert messages:

| Field | Description |
| --- | --- |
| sensorId | Camera or sensor identifier used for video retrieval. |
| timestamp | Event start time in ISO 8601 format. |
| end | Event end time in ISO 8601 format. |
| objectIds | List of tracked object IDs involved in the event. |
| category | Alert type used for prompt matching (e.g., "collision", "Stop Anomaly Module"). |
| place.name | Location identifier (e.g., intersection name, zone). |
| info | Additional metadata; extended with VLM verification results in responses. |

Verification Response#

Verified alerts include the original fields plus an extended info section containing:

| Field | Description |
| --- | --- |
| verification_response_code | HTTP-like numeric code: 200 = success, 4xx = client errors, 5xx = server/VLM errors. |
| verification_response_status | Error description when non-200 status. |
| verdict | One of: "confirmed" (alert verified true), "rejected" (alert verified false), "unverified" (verification incomplete due to error). |
| reasoning | VLM reasoning trace extracted from the <think> section. |

Example Request#

{
  "sensorId": "Lafayette_Agnew",
  "timestamp": "2025-09-11T00:08:27.822Z",
  "end": "2025-09-11T00:09:22.122Z",
  "objectIds": [
    "958741182",
    "958750871",
    "958834290",
    "958730631"
  ],
  "place": {
    "name": "city=Montague/intersection=Lafayette_Agnew"
  },
  "analyticsModule": {
    "id": "Collision Detection Module",
    "description": "Potential collision detected between 4 vehicles"
  },
  "category": "collision",
  "isAnomaly": true,
  "info": {
    "location": "42.48837572978232,-90.73894264480816,0.0",
    "primaryObjectId": "958750871"
  }
}

Example Response#

{
  "sensorId": "Lafayette_Agnew",
  "timestamp": "2025-09-11T00:08:27.822Z",
  "end": "2025-09-11T00:09:22.122Z",
  "objectIds": [
    "958741182",
    "958750871",
    "958834290",
    "958730631"
  ],
  "place": {
    "name": "city=Montague/intersection=Lafayette_Agnew"
  },
  "analyticsModule": {
    "id": "Collision Detection Module",
    "description": "Potential collision detected between 4 vehicles"
  },
  "category": "collision",
  "isAnomaly": true,
  "info": {
    "location": "42.48837572978232,-90.73894264480816,0.0",
    "primaryObjectId": "958750871",
    "verification_response_code": "200",
    "verification_response_status": "OK",
    "reasoning": "The video shows vehicle 958750871 approaching the intersection...",
    "verdict": "confirmed"
  }
}

Scaling and Performance Tuning#

The Alert Verification Microservice supports concurrent processing to maximize throughput. This section describes the key configuration parameters for scaling the service to meet your deployment requirements.

Concurrency Model#

The Alert Verification Microservice uses a main thread with a fixed-size worker pool for concurrent processing:

  1. Main Thread: Polls alerts from Kafka (or receives via HTTP API) and dispatches them to available workers.

  2. Worker Threads: Each worker processes an assigned alert through the full pipeline:

    • Retrieves video URL from VIOS for the alert time window

    • Renders the prompt with alert context

    • Passes video URL, prompts and other required inputs to VLM backend for verification

    • Publishes the verified result to configured sinks

Workers operate independently, allowing multiple alerts to be processed in parallel. The main thread blocks when all workers are busy, providing natural backpressure. The end-to-end latency of external services (VLM, VIOS) directly bounds throughput per worker.

Key Scaling Parameters#

The following parameters control concurrency and throughput:

| Parameter | Default | Impact |
| --- | --- | --- |
| alert_agent.num_workers | 1 | Number of concurrent worker threads. Increasing this value scales alert processing throughput roughly linearly until constrained by CPU/memory or downstream service limits (VLM, VIOS). |
| alert_agent.chunk_size | 1 | Number of alerts processed per batch within a worker. Higher values reduce overhead but increase memory usage. |
| kafka.max_poll_records | 10 | Maximum records fetched per Kafka poll. Controls batch size for ingestion; higher values improve throughput but increase memory and latency variance. |
| kafka.poll_timeout | 100 | Kafka consumer poll wait timeout (ms). Lower values reduce latency for sparse traffic; higher values reduce CPU usage. |
| kafka.max_poll_interval_ms | 60000 | Maximum interval between polls before the consumer is considered failed. Must exceed worst-case VLM processing time multiplied by batch size. |

Scaling Guidelines#

Vertical Scaling (Single Instance)

  1. Increase num_workers: Start with the number of CPU cores available. Monitor CPU utilization and increase until the VLM backend becomes the bottleneck.

  2. Tune chunk_size: For high-volume deployments, increase to 2-4 to reduce per-message overhead. Keep at 1 for low-latency requirements.

  3. Adjust max_poll_records: Match to num_workers × chunk_size for optimal batching. Avoid setting too high to prevent memory pressure.
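
The sizing guidance above follows Little's Law: required workers ≈ arrival rate × per-alert latency. A small helper (illustrative only, using the parameter names from the tables in this section) makes the arithmetic explicit:

```python
import math

def size_workers(alerts_per_sec, vlm_latency_sec, chunk_size=1):
    """Estimate scaling parameters from target throughput and VLM latency."""
    # Little's Law: concurrency needed to sustain the arrival rate.
    num_workers = math.ceil(alerts_per_sec * vlm_latency_sec)
    # Match Kafka batching to worker capacity.
    max_poll_records = num_workers * chunk_size
    return {"num_workers": num_workers, "max_poll_records": max_poll_records}

# e.g. 10 alerts/sec at 0.5 s VLM latency -> 5 workers
```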

Horizontal Scaling (Multiple Instances)

  1. Kafka Consumer Groups: Multiple Alert Verification Microservice instances can share the same event_bridge.kafka_source.group_id to distribute load across partitions.

  2. Partition Alignment: Ensure Kafka topic partitions >= number of consumer instances for effective load distribution.

  3. Stateless Design: The Alert Verification Microservice is stateless; scale horizontally by adding replicas in Kubernetes or running additional Docker containers.

Performance Tuning Example#

For a deployment targeting 10 alerts/second with average VLM latency of 0.5 seconds:

alert_agent:
  num_workers: 5         # 10 alerts/sec × 0.5 sec latency = 5 concurrent
  chunk_size: 1          # Process one alert at a time for consistent latency

kafka:
  max_poll_records: 5    # Match worker count
  poll_timeout: 100      # Low latency polling
  max_poll_interval_ms: 30000   # 30 seconds sufficient for fast VLM

Note

Monitor the following metrics to identify bottlenecks:

  • Worker utilization (all workers busy = scale up)

  • VLM request latency and error rates

  • Kafka consumer lag (growing lag = scale up or optimize)

  • Memory and CPU utilization

Configuration#

The Alert Verification Microservice is configured via a YAML-based configuration file. The following tables describe key configuration parameters organized by category.

VIOS Configuration#

| Parameter | Default | Description |
| --- | --- | --- |
| vst_config.base_url | http://localhost:30888 | Base URL for VIOS APIs. |
| vst_config.sensor_list_endpoint | /vst/api/v1/sensor/streams | Endpoint for retrieving sensor/stream list. |
| vst_config.segment_anchor | end | Segment anchor mode (end for end-anchored window). |
| vst_config.segment_duration_seconds | 10 | Clip segment duration in seconds. |
| vst_config.add_overlay | false | Enable bounding box overlay on video segments. |

Kafka Configuration#

| Parameter | Default | Description |
| --- | --- | --- |
| kafka.bootstrap_servers | localhost:9092 | Kafka broker addresses. |
| kafka.group_id | kafka-incidents-dumper | Default Kafka consumer group ID. |
| kafka.auto_offset_reset | latest | Offset reset policy (earliest/latest). |
| kafka.enable_auto_commit | false | Enable automatic offset commit. |
| kafka.max_poll_records | 10 | Maximum records per poll. |
| kafka.max_poll_interval_ms | 60000 | Maximum interval between polls (ms). |
| kafka.session_timeout_ms | 10000 | Session timeout (ms). |
| kafka.heartbeat_interval_ms | 3000 | Heartbeat interval (ms). |
| kafka.poll_timeout | 100 | Kafka consumer poll wait timeout (ms). |
| kafka.message_type | Incident | Protobuf message type (Incident or Behavior). |

Event Bridge Configuration#

| Parameter | Default | Description |
| --- | --- | --- |
| event_bridge.sourceType | kafka | Source type (kafka). |
| event_bridge.sinkType | kafka | Sink type (kafka). |
| event_bridge.kafka_source.group_id | alert-bridge-vlm-group | Kafka source consumer group. |
| event_bridge.kafka_source.topics.incident | mdx-incidents | Kafka topic for incidents. |
| event_bridge.kafka_source.topics.alert | mdx-alerts | Kafka topic for alerts. |

VLM Configuration#

Note

For detailed VLM API documentation, refer to the Cosmos Reason1 NIM documentation: TBD

| Parameter | Default | Description |
| --- | --- | --- |
| vlm.base_url | (required) | OpenAI-compatible VLM endpoint URL. |
| vlm.model | nvidia/cosmos-reason1-7b | VLM model name. |
| vlm.max_tokens | 4096 | Maximum tokens for VLM responses. |
| vlm.min_pixels | 1568 | Minimum pixel budget for video frames. |
| vlm.max_pixels | 345600 | Maximum pixel budget for video frames. |
| vlm.num_frames | 5 | Number of video frames to sample. |
| vlm.enable_sampling | false | Enable frame sampling mode. |
| vlm.sampling_fps | 4 | Sampling FPS when enabled. |

Note

The min_pixels and max_pixels parameters must be set in accordance with the VLM’s maximum context window.

Alert Verification Microservice Configuration#

| Parameter | Default | Description |
| --- | --- | --- |
| alert_agent.num_workers | 10 | Number of worker threads for concurrent processing. |
| alert_agent.max_allowed_stream_size | 2 | Maximum stream size in minutes. |
| alert_agent.default_stream_interval | 1 | Default stream interval in minutes. |
| alert_agent.vst_pass_through_mode | false | Skip VIOS lookup; use local media files directly. |
| alert_agent.chunk_size | 1 | Chunk size for processing. |
| alert_agent.enrichment.enabled | false | Enable alert enrichment for generating detailed event descriptions after verification. |

Elasticsearch Configuration#

| Parameter | Default | Description |
| --- | --- | --- |
| elastic.enabled | true | Enable Elasticsearch persistence. |
| elastic.hosts | [http://localhost:9200] | Elasticsearch host URLs. |
| vlm_enhanced_sink.incident.elastic.index | mdx-vlm-incidents | Elasticsearch index for verified incidents. |
| vlm_enhanced_sink.alert.elastic.index | mdx-vlm-alerts | Elasticsearch index for verified alerts. |

Prompt Configuration#

| Parameter | Default | Description |
| --- | --- | --- |
| alert_type_config_file | alert_type_config.json | Path to alert type configuration file. |
| prompt.prefer_payload_prompt | false | Prefer prompts from alert payload over stored prompts. |
| prompt.override_prompts_on_start | true | Override stored prompts with config file on startup. |

Logging Configuration#

| Parameter | Default | Description |
| --- | --- | --- |
| logging.level | DEBUG | Application log level (DEBUG, INFO, WARNING, ERROR, CRITICAL). |
| logging.format | (see config) | Log message format string. |
| logging.third_party_level | WARNING | Log level for third-party libraries. |
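
Putting the tables together, a minimal YAML configuration file might look like the following sketch. Values shown are the defaults from the tables above, except vlm.base_url, which has no default; the host shown is a placeholder for your deployment:

```yaml
vst_config:
  base_url: http://localhost:30888
  segment_anchor: end
  segment_duration_seconds: 10

kafka:
  bootstrap_servers: localhost:9092
  auto_offset_reset: latest
  max_poll_records: 10
  max_poll_interval_ms: 60000

vlm:
  base_url: http://vlm-host:8000/v1   # required; placeholder value
  model: nvidia/cosmos-reason1-7b
  max_tokens: 4096

alert_agent:
  num_workers: 10
  chunk_size: 1

elastic:
  enabled: true
  hosts: ["http://localhost:9200"]

logging:
  level: DEBUG
```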

API Reference