Quickstart

Overview

Real-Time Location System (RTLS) is a reference workflow/application for video analytics that tracks and localizes each unique person across multiple cameras in real time. Developers can extend it to other camera views, including outdoor scenarios.

This reference application uses live camera feeds as input, performs object detection, object tracking, and Multi-Camera Fusion - RTLS, provides various aggregated analytics functions as API endpoints, and visualizes the results via a browser-based user interface.

Live camera feeds are simulated by streaming video files over RTSP. The analytics microservices are connected via a Kafka message broker, and processed results are saved in a database for long-term storage.

The image below shows a visual representation of the RTLS app end-to-end pipeline:

Multi-Camera Fusion - RTLS pipeline

Media Management provides the video streaming and recording functionality that serves all downstream components as input sources. The RTSP streams coming out of NVStreamer and VST go to Perception (DeepStream), where raw metadata with a bounding box, tracker ID, class type, and an embedding vector for each detected object is generated. The raw metadata is transferred through the Kafka message broker to Multi-Camera Fusion - RTLS for analytics. The processed results from Multi-Camera Fusion - RTLS, along with the raw metadata from Perception, are saved in Elasticsearch via the Kafka message broker and Logstash data ingestion. Web API queries the saved data from Elasticsearch and provides endpoints with various integrated analytics and utilities, and Web UI leverages the Web API endpoints to create a browser-based user interface for easy data visualization.

We also provide an option to deploy the reference application without the heavy GPU-dependent modules, using pre-extracted metadata as input, so that users have a lightweight way to explore the reference application. Compared to the end-to-end mode above, we call this option the playback mode. The image below shows a visual representation of the RTLS app playback mode pipeline:

RTLS pipeline playback

Quick Deployment

Deploy

  • To download and set up the Metropolis apps, refer to the Quickstart Setup section. For the RTLS app, we recommend using the transformer model. Go to metropolis-apps-standalone-deployment/docker-compose/foundational/.env and set MODEL_TYPE=transformer.

  • The --profile argument needs to be added to the docker compose up command to select between two types of deployment: end-to-end and playback. To deploy the RTLS app, navigate to the metropolis-apps-standalone-deployment/docker-compose folder.

    • The end-to-end mode deploys every related module, from NVStreamer/VST to the UI, and lets you fully explore the entire pipeline. To deploy everything end-to-end, use --profile e2e:

      $ docker compose -f foundational/mdx-foundational.yml -f rtls-app/mdx-rtls-app.yml --profile e2e up -d --pull always --build --force-recreate
      
    • The playback mode does not deploy NVStreamer, VST, or Perception (DeepStream). Instead, a playback module streams saved metadata as the input data. The playback mode lets you quickly investigate the data flow or replay saved data with the most lightweight pipeline. To deploy the playback mode, use --profile playback:

      $ docker compose -f foundational/mdx-foundational.yml -f rtls-app/mdx-rtls-app.yml --profile playback up -d --pull always --build --force-recreate
      

    Initialization of some components of the reference application might take up to a minute. After the deployment, the following ports will be used to expose services from the reference application:

    • Calibration-Toolkit - 8003

    • Default Kafka port - 9092

    • Default ZooKeeper port - 2181

    • Elasticsearch and Kibana (ELK) - 9200 and 5601, respectively

    • Jupyter Lab - 8888

    • NVStreamer - 31000 (for e2e mode only)

    • VST - 30000 (for e2e mode only)

    • Web-API - 8081

    • RTLS Web-UI - 3003
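
    As a quick sanity check after deployment, you can confirm that the containers are running and that the exposed ports respond. The snippet below is a minimal sketch based on the ports listed above; the Web API and UI may return redirects or 404s on the root path, which still indicates the service is up:

      $ # List the containers started by the compose files (use the profile you deployed with)
      $ docker compose -f foundational/mdx-foundational.yml -f rtls-app/mdx-rtls-app.yml --profile e2e ps
      $ # Probe a few of the exposed service ports
      $ curl -s http://localhost:9200/ | head                              # Elasticsearch
      $ curl -s -o /dev/null -w "%{http_code}\n" http://localhost:8081/    # Web API
      $ curl -s -o /dev/null -w "%{http_code}\n" http://localhost:3003/    # RTLS Web UI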

Shutdown

To gracefully stop and shut down all services, run the command with the corresponding profile:

$ docker compose -f foundational/mdx-foundational.yml -f rtls-app/mdx-rtls-app.yml --profile e2e down
$ docker compose -f foundational/mdx-foundational.yml -f rtls-app/mdx-rtls-app.yml --profile playback down

Clean up (Optional)

Clean up data logs and Docker images to ensure a clean re-deployment - highly recommended after any configuration or customization.

  • Clean up existing data and logs under the metropolis-apps-data/data_log/ folder:

    $ sudo chmod +x cleanup_all_datalog.sh
    $ ./cleanup_all_datalog.sh
    

Note

  • The cleanup_all_datalog.sh script is present inside metropolis-apps-standalone-deployment/docker-compose/ and includes an optional --delete-calibration-data flag. This flag accepts true or false as values, with false being the default.

  • Camera calibration is a time-consuming process. To preserve the calibration data, run the script without the flag or with the flag set to false, as in ./cleanup_all_datalog.sh --delete-calibration-data false.

  • Clean up Docker images and cached volumes (selective):

    $ docker compose -f foundational/mdx-foundational.yml -f rtls-app/mdx-rtls-app.yml --profile e2e down --rmi all && docker volume rm `docker volume ls -q| grep -v 'deepstream\|calibration'`
    $ docker compose -f foundational/mdx-foundational.yml -f rtls-app/mdx-rtls-app.yml --profile playback down --rmi all && docker volume rm `docker volume ls -q| grep -v 'deepstream\|calibration'`
    
  • Or, clean up existing Docker images and all cached volumes:

    $ docker compose -f foundational/mdx-foundational.yml -f rtls-app/mdx-rtls-app.yml --profile e2e down --volumes --rmi all
    $ docker compose -f foundational/mdx-foundational.yml -f rtls-app/mdx-rtls-app.yml --profile playback down --volumes --rmi all
    $ docker volume prune
    

Note

  • The first set of commands preserves the calibration data volume.

  • It also retains the DeepStream volume to avoid the need to re-build engine files for the perception pipeline, which can be very time-consuming for heavy models (ViT-based, Swin-based, etc.).
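
To check which cached volumes the selective cleanup commands keep, you can list them with the same filter those commands use (a minimal sketch; exact volume names depend on your deployment):

    $ # Volumes matched by the filter (retained by the selective cleanup)
    $ docker volume ls -q | grep 'deepstream\|calibration'
    $ # Volumes that the selective cleanup would remove
    $ docker volume ls -q | grep -v 'deepstream\|calibration'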

Troubleshoot

To troubleshoot issues, developers can start with the FAQs.


Explore the Results

Explore the application features & output data via the following interfaces:

Kibana Dashboards

You can explore data in Kibana at http://localhost:5601/ (replace localhost with the deployment system IP if opening the browser on a remote system). An init container imports the mdx-rtls index pattern automatically, so you should see data under Menu/Discover.

RTLS Kibana

From this sample screen capture, you can view the mdx-rtls data, which contains the real-time location info of each globally identified unique object.

The Kibana dashboard is a powerful tool to visualize data. You can create other index patterns on existing data, e.g. mdx-raw, or create dashboards. You can read more about Kibana in its official documentation.
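
If you prefer to verify the ingestion from the command line instead of the Kibana UI, you can query Elasticsearch directly. The commands below are a minimal sketch; the exact index names behind the mdx-rtls and mdx-raw patterns depend on your deployment:

$ # List the indices created by the Logstash data ingestion
$ curl -s "http://localhost:9200/_cat/indices?v"
$ # Fetch one sample document matching the mdx-rtls index pattern
$ curl -s "http://localhost:9200/mdx-rtls*/_search?size=1&pretty"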

Reference UI

Once the RTLS app deployment is complete, open a browser and access the reference UI at http://localhost:3003/ (replace localhost with the deployment system IP if opening the browser on a remote system).

Note

  • It will take some time for the UI to start running. You can monitor UI deployment progress from the Docker logs by running: $ docker logs mdx-rtls-ui --follow

You should be able to see the UI window as below:

RTLS UI

The main window displays the floor plan map of the analyzed space. Each dot moving on the map indicates the location of a globally identified unique object. Each object is labeled with its global ID, and the live motions are marked with colored trajectories. Camera icons are shown on the map to indicate the location and orientation of all the cameras used. The field of view of each camera can be viewed by hovering over the corresponding camera icon. The bottom of the UI displays the current total number of unique detected objects.

Note

The AMR count shown on the RTLS UI is 0 since there is no active AMR data in the Retail synthetic videos.

For in-depth documentation of the UI, refer to the Real Time Location System UI section.


Components

To further understand this reference application, here is a brief description of the key components.

Media Management

Both NVStreamer and VST are tailored media microservices with functionality specialized for the management and storage of live camera feeds and pre-recorded videos. The NVIDIA media microservices group provides various video management services that are critical for end-to-end intelligent video analytics applications. For more details on the NVIDIA Media Services, refer to the Media Services sections of this documentation.

In this reference app, NVStreamer acts as a simulated live camera source by streaming the provided video files as RTSP streams. Those RTSP streams are pipelined into VST just like live cameras, and from there VST acts as the video management system and interacts with downstream microservices such as Perception and UI.

The key functionalities of NVStreamer and VST in this reference app include:

  • NVStreamer provides RTSP streaming links from the given video files as the input source to VST; these are considered simulated live streams.

  • VST provides RTSP streaming links to Perception (DeepStream) for image processing.

  • VST creates video clips overlaid with bounding boxes from extracted metadata.

VST provides a browser-based UI, which you can access at http://localhost:30000/.

There are multiple video files provided in the metropolis-apps-data folder. The RTLS reference app uses the eight videos named Retail_Synthetic_Cam<id>.mp4, which are present inside metropolis-apps-data/videos/rtls-app/. Those eight videos are captured in a virtual retail store from eight cameras with different viewing angles and are synchronized in time. Here is a sample view from Retail_Synthetic_Cam01:

RTLS video sample

Note

In the RTLS reference app, only 4 cameras are used if you select the transformer model, as it requires more GPU resources.
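
If you want to change the model type before re-deployment, update the .env file mentioned in the Deploy section. Below is a minimal sketch that assumes the file already contains a MODEL_TYPE entry; supported values are listed in the Quickstart Setup section:

$ # Point the perception pipeline at the transformer model
$ sed -i 's/^MODEL_TYPE=.*/MODEL_TYPE=transformer/' metropolis-apps-standalone-deployment/docker-compose/foundational/.env
$ # Confirm the setting
$ grep MODEL_TYPE metropolis-apps-standalone-deployment/docker-compose/foundational/.env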

Perception (DeepStream)

The Perception (DeepStream) component consists of a PGIE and single-camera tracker pipeline, where the tracker deploys a re-identification model to extract feature vectors for each person. It then generates streaming perception metadata, which is consumed by Metropolis apps via the Kafka broker. These messages correspond to eight sensors and act as input data to the RTLS reference app. The messages from the perception microservice are serialized in protobuf format.

The key contents of the message are:

  • sensor ID

  • frame ID and timestamp

  • detection bounding box

  • tracking ID

  • extracted feature vector

For more information on the schema and contents of the sensor metadata, refer to the Protobuf Schema section.
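
To peek at the raw perception messages on the broker, you can attach a command-line consumer. The sketch below assumes the kcat client is installed on the host and that the raw metadata topic is named mdx-raw (mirroring the Elasticsearch index pattern); check the topic list from the first command, and note that the payloads are protobuf-encoded binary rather than plain text:

$ # List topics available on the Kafka broker (default port 9092)
$ kcat -b localhost:9092 -L
$ # Consume a few messages from the assumed raw metadata topic, printing metadata only
$ kcat -b localhost:9092 -t mdx-raw -C -c 5 -f 'topic=%t partition=%p offset=%o payload_bytes=%S\n'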

Note

  • The provided video files are 10 minutes in length. In this reference application, Perception is configured to terminate and stop sending metadata when the video streaming reaches the end of the file.

  • If you want to reprocess the video from the start, you can restart only the perception microservice (DeepStream app) by running docker restart mdx-deepstream.

Multi-Camera Fusion - RTLS

The Multi-Camera Tracking component consumes raw data, which is processed into behaviors (trajectories, embeddings, etc.), and clusters the behaviors based on re-identification feature embeddings and spatio-temporal information.

More specifically, in this reference app, the RTLS component operates in live mode: it consumes raw data from a Kafka topic, manages the behavior state, and then merges the clustering results with existing IDs.

The pipeline workflow and the configuration are discussed in-depth in the Multi-Camera Tracking microservice section.

Web API

The Web API component provides REST APIs for the data produced by the Multi-Camera Fusion - RTLS module. The exposed Web API functions are used by the RTLS UI, for example:

  • Fetch the count of unique objects of different object types, along with their spatial locations, for a given place

Example - The following request fetches the number of unique people seen over a given time range:

curl "http://localhost:8081/tracker/unique-object-count-with-locations?place=building%3DRetail-Store"

The Web API component is started with the index.js present in modules/analytics-tracking-web-api; we encourage you to go through the code. For in-depth documentation of the component, refer to the Analytics and Tracking API section.
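
For a quick look at the response from the command line, you can pretty-print the endpoint output shown above with jq (a minimal sketch; the response structure is documented in the Analytics and Tracking API section):

$ curl -s "http://localhost:8081/tracker/unique-object-count-with-locations?place=building%3DRetail-Store" | jq .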

Web UI

The RTLS reference web UI visualizes the valuable insights generated by the RTLS app and helps in the monitoring and management of indoor spaces such as rooms, hallways, and entry/exit doors. For in-depth documentation of the component, refer to the Real Time Location System UI section.


Conclusion

Congratulations! You have successfully deployed key microservices and built a Real-Time Location System application.

We encourage you to explore the remaining reference applications provided as part of Metropolis Microservices. Below are additional resources:

  • Quickstart Setup - Guide to deploy all reference applications in the standalone mode via Docker Compose.

  • Production Deployment Setup - Guide to deploying Metropolis microservices in a Kubernetes environment.

  • FAQs - A list of commonly asked questions.