Agents#

The Warehouse Blueprint incorporates an agentic AI system that provides natural language interaction capabilities for querying warehouse safety incidents, generating reports, and retrieving visual information from warehouse cameras.

The suggested models are NVIDIA Nemotron Nano 9B v2 (`nvidia/nvidia-nemotron-nano-9b-v2`) for natural language understanding and query routing, and NVIDIA Cosmos Reason2 8B (`nvidia/cosmos-reason2-8b`) for video understanding and analysis of warehouse safety incidents.

For detailed information on all components, APIs, and customization options, refer to the VSS Agents documentation.

Agent Architecture#

The agent system uses a two-tier hierarchy:

- Top Agent: LLM-powered routing agent that interprets queries, decides which tools or sub-agents to invoke, maintains conversation context, and streams reasoning traces.
- Sub-Agents: Specialized agents for different types of reporting tasks:
  - Report Agent: Handles detailed, comprehensive reports for single incidents. Retrieves incident data, performs video analysis using VLMs, and generates structured markdown reports with incident details, location information, and the people and vehicles involved.
  - Multi-Report Agent: Handles listing and summarizing multiple incidents. Supports filtering by time range, sensor, and incident count. Provides incident summaries with optional chart generation for visualizing incident trends.

For detailed agent architecture information, see Agent Overview.

Tools#

The Top Agent has direct access to a small set of tools and routes more complex requests to the Report and Multi-Report sub-agents, which in turn use the Video Analytics, Video Storage, and Report Generation tools listed below. For the canonical tool reference, see Video-Analytics-MCP Server.

Top Agent Tools

Tool	Description
`vst_sensor_list`	Lists available sensors/cameras from VIOS (VST)
`vst_picture_url`	Retrieves a snapshot/picture URL for a sensor (live or at a specified timestamp)
`get_fov_counts_with_chart`	Provides occupancy statistics with histogram visualizations

Video Analytics Tools

Tool	Description
`video_analytics_mcp.video_analytics__get_incidents`	Retrieves multiple incidents filtered by sensor, place, time range, and VLM verdict
`video_analytics_mcp.video_analytics__get_incident`	Retrieves a specific incident by incident ID
`video_analytics_mcp.video_analytics__get_fov_histogram`	Gets field-of-view occupancy histogram data
`video_analytics_mcp.video_analytics__get_sensor_ids`	Lists all available sensors/cameras (filtered by place when applicable)

Video Storage Tools

Tool	Description
`vst_video_url` (`vst.video_clip`)	Retrieves video clip URLs for incident playback, with optional bounding-box overlay
`vst_picture_url` (`vst.snapshot`)	Retrieves snapshot/picture URLs at a given timestamp, with optional bounding-box overlay
`vst_sensor_list` (`vst.sensor_list`)	Returns the list of sensors registered in VIOS

Report Generation Tools

Tool	Description
`video_understanding`	Uses the VLM to analyze video frames and extract incident information
`template_report_gen`	Combines incident data and VLM observations into a structured markdown report using a template and the LLM
`chart_generator`	Creates visualization charts for reports
`multi_incident_formatter`	Formats multiple incidents (with video/snapshot URLs and optional charts) for summary reports

Sub-Agents

Sub-Agent	Description
`report_agent`	Generates a detailed report for a single incident (uses `video_analytics_mcp.video_analytics__get_incident`, `video_analytics_mcp.video_analytics__get_incidents`, and `template_report_gen`)
`multi_report_agent`	Lists and summarizes multiple incidents (uses `multi_incident_formatter`, which in turn calls the incidents, video URL, picture URL, and chart generator tools)
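
The division of labor in this section, where simple lookups go straight to a tool and reporting requests go to a sub-agent, can be illustrated with a small sketch. The tool and sub-agent names are taken from the tables above; the `route` function and its keyword rules are hypothetical and only stand in for the decision that the LLM-powered Top Agent makes from its routing prompt.

```python
import re

# Names on the right-hand side come from the tables in this section.
# The keyword rules are an illustrative assumption, not the agent's real logic:
# the actual Top Agent lets the LLM choose the tool or sub-agent.
def route(query: str) -> str:
    """Return the name of the tool or sub-agent a query would be sent to."""
    q = query.lower()
    if "snapshot" in q or "picture" in q:
        return "vst_picture_url"
    if "sensors" in q or "cameras" in q:
        return "vst_sensor_list"
    if "how many people" in q or "occupancy" in q:
        return "get_fov_counts_with_chart"
    # A report request that names one incident ID goes to the single-incident agent.
    if re.search(r"\breport\b.*\bincident \d+", q):
        return "report_agent"
    if "incident" in q:
        return "multi_report_agent"
    return "unrouted"
```

Under these toy rules, `route("Generate a report for incident 12345")` returns `report_agent`, while `route("List last 5 incidents for Camera_01")` returns `multi_report_agent`.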

Supported Queries#

Query Type	Examples
Sensor Discovery	`List all available sensors`
Snapshots	`Take a snapshot of Camera_01`
Incident Listing	`List last 5 incidents for Camera_01`, `Show incidents in the last 24 hours`
Reports	`Generate a report for incident 12345`, `Give a report for Camera_01 in last hour`
Occupancy Metrics	`How many people were in Camera_01 20 minutes ago?`
Multi-Step	`List last 5 incidents; generate report for the second one`

The agent understands temporal expressions such as “last 10 minutes”, “yesterday”, and “past hour”.
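
To make the idea of resolving a temporal expression concrete, here is a minimal sketch that turns a few relative phrases into a start and end timestamp. It is not how the agent does it (the agent relies on the LLM to interpret the query); `parse_window` is a hypothetical helper, and treating "yesterday" as the previous calendar day is an assumption.

```python
import re
from datetime import datetime, timedelta

# Maps the singular unit captured by the regex to a timedelta keyword argument.
_UNITS = {"minute": "minutes", "hour": "hours", "day": "days"}

def parse_window(expr: str, now: datetime) -> tuple[datetime, datetime]:
    """Resolve a relative expression such as 'last 10 minutes' to (start, end)."""
    text = expr.lower().strip()
    if text == "yesterday":
        # Assumption: 'yesterday' means the previous calendar day, midnight to midnight.
        midnight = now.replace(hour=0, minute=0, second=0, microsecond=0)
        return midnight - timedelta(days=1), midnight
    match = re.fullmatch(r"(?:last|past)\s+(?:(\d+)\s+)?(minute|hour|day)s?", text)
    if match is None:
        raise ValueError(f"unsupported expression: {expr!r}")
    count = int(match.group(1) or 1)  # 'past hour' means one hour
    return now - timedelta(**{_UNITS[match.group(2)]: count}), now
```

Passing `now` explicitly keeps the helper deterministic, which makes it easy to check how a phrase would translate into the time-range filter of an incident query.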

Note

`Camera_01` is an example sensor name. Ask `List all available sensors` to discover your sensor names.

Example: Sensor Discovery

Example: Incident Report with Chart

Example: Camera Snapshot

Warehouse Agent Configuration#

Configuration File Location

The warehouse agent configuration file is located at `deploy/docker/industry-profiles/warehouse-operations/vss-agent/configs/config.yml`. For detailed configuration options and YAML structure, see Agent Configuration.

Models Used

Model	Type	Purpose
`nvidia/nvidia-nemotron-nano-9b-v2`	LLM	Query routing, reasoning, report generation
`nvidia/cosmos-reason2-8b`	VLM	Video understanding and analysis

Key Warehouse-Specific Settings

The warehouse profile configures the following key components:

- Routing prompt: Optimized for warehouse video surveillance incident reporting
- VLM prompts: Tuned to analyze incidents, location conditions (lighting, floor, blockages), people (worker/driver/pedestrian), and vehicles (forklift/transporter)
- Report template: Warehouse incident report format

For the complete configuration reference including all tools, agents, environment variables, and YAML examples, see Agent Configuration.

For information about other available profiles (Smart Cities, Developer), see Agent Profiles.

Customization#

The agent configuration is designed to be flexible and extensible. Key customization areas for the warehouse deployment include:

- Prompt Customization: Modify `workflow.prompt` to adjust routing logic for warehouse-specific query patterns
- VLM Prompt Engineering: Tune `vlm_prompts` in `template_report_gen` for different warehouse scenarios
- Report Templates: Create custom markdown templates in the `templates` directory for warehouse-specific report formats
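
As a rough illustration of the first item, the sketch below overrides a nested setting in a configuration that has already been loaded into a Python dictionary. Only the key path `workflow.prompt` comes from this page; the `with_override` helper, the sample dictionary, and the prompt text are hypothetical, and loading or writing the YAML file itself is left out. The same helper could be pointed at `vlm_prompts`, but the full key path for that setting depends on the layout of your `config.yml`.

```python
import copy

def with_override(config: dict, dotted_key: str, value) -> dict:
    """Return a copy of config with the nested key at dotted_key set to value."""
    updated = copy.deepcopy(config)  # leave the original configuration untouched
    node = updated
    *parents, leaf = dotted_key.split(".")
    for key in parents:
        # Create intermediate mappings when they are missing.
        node = node.setdefault(key, {})
    node[leaf] = value
    return updated

# Hypothetical stand-in for the parsed contents of config.yml.
base = {"workflow": {"prompt": "default routing prompt"}}
custom = with_override(
    base,
    "workflow.prompt",
    "Route single-incident forklift questions to report_agent.",
)
```

Working on a copy keeps the shipped profile intact, so a customized prompt can be compared against the default before it is deployed.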

For comprehensive customization guidance including adding custom functions and MCP servers, see Agent Customization.

Agent Skills#

In addition to the in-product VSS Agent described above, a deployed Warehouse Blueprint can be operated from a coding agent (Claude Code, Codex, NemoClaw) using Agent Skills. The following Skills are particularly relevant to the Warehouse Blueprint:

- `vss-deploy-profile`: Configure, deploy, debug, and tear down the Warehouse profile via Docker Compose without manual `.env` editing.
- `vss-deploy-detection-tracking-2d`: Operate the RTVI-CV perception microservice for the warehouse-2d and warehouse-3d use cases.
- `vss-deploy-detection-tracking-3d`: Deploy and operate the standalone RTVI-CV-3D / MV3DT stack for sample data, custom videos, or RTSP streams. See the MV3DT Agent Skills walkthrough for sample prompts and expected agent steps.
- `vss-query-analytics`: Query incidents, sensors, and FOV metrics from Elasticsearch via the VA-MCP server (the same data the Warehouse Agent uses).
- `vss-setup-behavior-analytics`: Deploy the vss-behavior-analytics service standalone with custom configurations or calibration.
- `vss-manage-video-io-storage`: Manage video and stream operations, recording timelines, clip extraction, and snapshots through VIOS.
- `vss-generate-video-calibration`: Calibrate multi-camera datasets with AutoMagicCalib (AMC). See the Auto Calibration Agent Skills walkthrough for sample prompts and expected agent steps.

For the full agent skills catalog and installation instructions, see Agent Skills.

Agent Evaluation#

The VSS Agent includes a comprehensive evaluation framework for assessing agent performance across different dimensions including report quality, question-answering accuracy, and trajectory quality.

For detailed information on configuring evaluators, creating evaluation datasets, and interpreting results, see Agent Evaluation.

Observability#

Warehouse agents support distributed tracing via Phoenix through NeMo Agent Toolkit (NAT) telemetry export.

To enable Phoenix telemetry, configure telemetry in the agent configuration file (see Agent Configuration) and set `PHOENIX_ENDPOINT` in your environment.

Once telemetry is enabled and your agent is running, access the Phoenix UI at `http://<HOST_IP>:7777/phoenix` (the projects view is typically under `/projects`).
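
A small pre-flight check can catch a missing environment variable before you go looking for traces. The sketch below is a hypothetical helper, not part of VSS or NAT: it only verifies that `PHOENIX_ENDPOINT` is set and builds the UI address from the host IP, using the port and path stated above. It makes no assumption about what value `PHOENIX_ENDPOINT` should hold.

```python
import os

def phoenix_ui_url(host_ip: str, env=None) -> str:
    """Check that PHOENIX_ENDPOINT is set, then return the Phoenix UI address."""
    env = os.environ if env is None else env  # allow injecting a mapping for testing
    if not env.get("PHOENIX_ENDPOINT"):
        raise RuntimeError("PHOENIX_ENDPOINT is not set")
    # Port 7777 and the /phoenix path are the ones given in this section.
    return f"http://{host_ip}:7777/phoenix"
```

Calling `phoenix_ui_url("<HOST_IP>")` with your host address from the shell where the agent is launched confirms that the variable is visible in that environment.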

For a walkthrough of the Phoenix UI (projects view, traces list, and sample trace anatomy), see Observability.

Known Limitations#

For information on warehouse blueprint known limitations, please refer to the Known Limitations section.

For information on general VSS Agent known issues, please refer to the VSS Agent Known Issues section.