Metropolis Microservices

NVIDIA® Metropolis Microservices is a suite of cloud-native microservices and reference AI workflows to fast-track the development and deployment of vision AI applications from edge to cloud.

Metropolis Microservices provides customizable cloud-native building blocks for vision AI applications that unlock business insights across a wide range of spaces, from roadways to airports to retail stores. The microservices are brought to life with reference AI workflows that understand people flow, estimate building occupancy, create heatmaps, and more. With Metropolis Microservices, you can build, test, and scale deployments from edge to cloud with enhanced resilience and industry best-practice security.

Problems It Can Help Solve

Metropolis Microservices accelerates the development of solutions and services for smart infrastructure, retail and logistics, smart hospitals, factories, and more. It is designed for software and system developers who want to generate business insights using AI, especially at scale. Common use cases include:

  1. Queue Analytics: Analyze queues at an airport or a retail checkout lane.

  2. Retail Analytics: Analyze shopper dwell time, trajectories, and heatmaps, and help with loss prevention.

  3. Autonomous Retail: Uniquely identify and track individuals across cameras without using any personally identifiable information (PII).

  4. Occupancy Analytics: Compute people count, flow (with direction), heatmap, and line-crossing, as well as detect events in user-defined regions of interest.

  5. Anomaly Detection and Alert Generation: Detect anomalies such as entry into prohibited areas, loitering, and wrong-way movement.

  6. Smart Self-Checkout: Augment existing kiosks with few-shot vision capabilities to reduce loss.
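
To make the occupancy use case more concrete, the following is a minimal sketch of counting tracked people inside a user-defined region of interest. It is not Metropolis code: the function names, the dictionary of object positions, and the polygon format are all hypothetical, and positions are assumed to be 2D points in the same coordinate system as the ROI.

```python
from typing import Dict, List, Tuple

Point = Tuple[float, float]


def point_in_polygon(p: Point, polygon: List[Point]) -> bool:
    """Ray-casting test: cast a ray to the right of p and count edge crossings.

    Points exactly on the boundary are not handled specially.
    """
    x, y = p
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Only edges that straddle the horizontal line through p can be crossed.
        if (y1 > y) != (y2 > y):
            # x-coordinate where the edge meets that horizontal line.
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def roi_occupancy(positions: Dict[str, Point], roi: List[Point]) -> int:
    """Count how many tracked objects (hypothetical id -> position map) are in the ROI."""
    return sum(1 for p in positions.values() if point_in_polygon(p, roi))


# Hypothetical example: a 4 x 4 square ROI and three tracked people.
roi = [(0.0, 0.0), (4.0, 0.0), (4.0, 4.0), (0.0, 4.0)]
positions = {"person-1": (1.0, 1.0), "person-2": (5.0, 1.0), "person-3": (3.5, 2.0)}
count = roi_occupancy(positions, roi)
```

In this example two of the three points fall inside the square, so `count` is 2. A production system would also need to handle tracking noise and boundary cases, which this sketch ignores.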

Key Features

  1. Uniquely track objects and people across cameras with advanced new transformer-based detection and ReID models for significantly improved accuracy and robustness.

  2. Complete end-to-end few-shot learning system for self-checkout use cases that learns and detects previously unseen retail objects.

  3. Fully modularized microservice architecture that can be deployed, managed, and scaled using Kubernetes and Helm, with edge-to-cloud connectivity features for seamless streaming and inferencing using WebRTC and OpenVPN.

  4. Visualize raw and filtered data in Kibana with enhanced user interfaces offering new visualization options like thumbnails for event cards and polygonal field of view drawings.

  5. Pre-built analytics to calculate object speed, distance, line-crossing, and region of interest (ROI), with Single-View 3D Tracking (SV3DT) for enhanced performance.

  6. Browser-based toolkit for camera calibration and creating ROIs in the image plane.

  7. Query and extract metadata using REST APIs, with upgraded underlying systems supporting dynamic configurations and on-the-fly updates.

  8. Extract live and recorded media data from the VMS and play in the UI, alongside dynamic camera stream management in Docker and Kubernetes deployments.

  9. Fully customizable reference AI workflows, such as Multi-Camera Tracking with new Real-Time Location System (RTLS) mode, Occupancy Analytics, and Few-Shot Product Recognition, plus new workflows and guides leveraging digital twin and synthetic data generation.

  10. Pretrained models for person Re-ID, retail object detection, and retail object recognition. Models can be fine-tuned on custom datasets using NVIDIA TAO Toolkit, with data-driven automatic accuracy tuning via NVIDIA PipeTuner.
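
As a rough illustration of the kind of computation behind the line-crossing and speed analytics listed above, the sketch below counts directional crossings of a tripwire and estimates average speed from a single trajectory. It is not the Metropolis implementation: the function names and the `(t, x, y)` track format are assumptions, with coordinates taken to be calibrated world positions in meters and timestamps in seconds.

```python
import math
from typing import List, Tuple

Point = Tuple[float, float]
Sample = Tuple[float, float, float]  # (timestamp_s, x_m, y_m), a hypothetical format


def side(a: Point, b: Point, p: Point) -> float:
    """Cross product (b - a) x (p - a): > 0 if p is left of a->b, < 0 if right."""
    return (b[0] - a[0]) * (p[1] - a[1]) - (b[1] - a[1]) * (p[0] - a[0])


def count_line_crossings(track: List[Sample], a: Point, b: Point) -> Tuple[int, int]:
    """Return (right_to_left, left_to_right) crossings of the segment a->b.

    Only strict crossings are counted; touching the line does not count.
    """
    right_to_left = 0
    left_to_right = 0
    for (_, x1, y1), (_, x2, y2) in zip(track, track[1:]):
        p1, p2 = (x1, y1), (x2, y2)
        d1, d2 = side(a, b, p1), side(a, b, p2)
        d3, d4 = side(p1, p2, a), side(p1, p2, b)
        # The step crosses the tripwire if each segment separates the other's endpoints.
        if d1 * d2 < 0 and d3 * d4 < 0:
            if d1 < 0:
                right_to_left += 1
            else:
                left_to_right += 1
    return right_to_left, left_to_right


def average_speed(track: List[Sample]) -> float:
    """Path length divided by elapsed time, in meters per second."""
    if len(track) < 2:
        return 0.0
    dist = sum(
        math.hypot(x2 - x1, y2 - y1)
        for (_, x1, y1), (_, x2, y2) in zip(track, track[1:])
    )
    elapsed = track[-1][0] - track[0][0]
    return dist / elapsed if elapsed > 0 else 0.0


# Hypothetical example: a person walking in the -x direction across a vertical tripwire.
tripwire_a, tripwire_b = (0.0, 0.0), (0.0, 10.0)
track = [(0.0, 2.0, 5.0), (1.0, 1.0, 5.0), (2.0, -1.0, 5.0), (3.0, -2.0, 5.0)]
crossings = count_line_crossings(track, tripwire_a, tripwire_b)
speed = average_speed(track)
```

Here the track crosses the tripwire once, moving from its right side to its left side, and covers 4 m in 3 s. Direction is defined relative to the orientation of the tripwire, so swapping `tripwire_a` and `tripwire_b` swaps the two counts.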

What’s Inside?

The collection currently includes 6 workflows, with more being added. It also features more than 10 proprietary microservices and over a dozen microservices based on open-source, industry-standard software such as Kafka.

Helm Charts & Containers

All microservices and reference applications can be orchestrated using Kubernetes. To simplify deployment, all NVIDIA and open-source microservices are provided as Docker containers, and nearly all have accompanying Helm charts.

Data File Package

The data is sensitive, so please read the terms and license in our EULA before downloading. If you download this data, we may later ask you to delete or destroy all of it. All data is provided for demo purposes only and is intended for use with the respective reference workflows.

The current version is packaged with the following data:

  • 7 videos capturing people movement from 7 cameras, as well as the perception metadata from processing those videos. This dataset is from a real small warehouse scene and is used with the Multi-Camera Tracking workflow.

  • 8 videos capturing people movement from 8 cameras, as well as the perception metadata from processing those videos. This dataset is from an Omniverse virtual warehouse and is used with the Real Time Location System workflow.

  • 3 videos capturing people movement outside and inside of two cafeterias, as well as the perception metadata from processing those videos. These data are to be used with the Occupancy Analytics & Occupancy Heatmap workflows.

  • 1 video of a person scanning various retail objects. This video is to be used with the Few-Shot Product Recognition workflow.

How to Get Started?

There are 6 reference workflows to help you get started with Metropolis Microservices. Developers should first try at least one of the workflows, or all of them, to get familiar with the features and capabilities. Developers should also try running the apps with their own videos or using the Kibana dashboard for data exploration. Once familiar with the key features and microservices, developers can modify the workflows for custom use cases and rules, for example by creating or modifying a UI or by building complete customized solutions.


Quickstart Setup

Reference Workflow Quickstart Guides

  1. Multi-Camera Tracking

  2. Real Time Location System

  3. Multi-Camera Sim2Deploy

  4. Occupancy Analytics

  5. Occupancy Heatmap

  6. Few-Shot Product Recognition


  • MTMC: Multi-Target Multi-Camera

  • RTLS: Real Time Location System

  • FSL: Few-Shot Learning

  • DS: DeepStream

  • E2E: End-to-end

  • VMS: Video management system

  • K8s: Kubernetes