Zero Shot Detection with Jetson Platform Services

Overview

Generative AI vision transformers such as CLIP have made it possible to build zero shot detection models capable of open vocabulary object detection. This means the model is not bound to a pre-defined set of classes; the objects to detect are configured at runtime by the user.

The Zero Shot Detection AI service enables quick deployment of generative AI with Jetson Platform Services for open vocabulary detection on live video stream input. The service exposes REST API endpoints to control the stream input and the objects to detect.

| API Endpoint | Description |
| --- | --- |
| /api/v1/live-stream | Manage live streams the AI service has access to. |
| /api/v1/detection/classes | Set object classes to be detected. |

This AI service is provided as a prebuilt Docker container that can be launched with docker compose. It is configured through JSON config files and integrates with foundation services such as jetson-redis and jetson-ingress. We provide example compose and configuration files for easy deployment in the reference workflow resource.

Zero Shot Detection AI Service Diagram

| Service | Required | Notes |
| --- | --- | --- |
| jetson-ingress | Yes | Required to access the Zero Shot Detection REST APIs through the API Gateway port (30080). |
| jetson-redis | Yes | Required for detection metadata output. |
| jetson-vst | No | Recommended to manage RTSP streams that can be used as input to the AI service. Needed if integrating with SDR. |
| jetson-storage | No | Recommended to help store the container and model, but not necessary. The storage required for this service should fit on most Jetson devices by default. |
| jetson-firewall | No | Recommended in real deployments to restrict access to ports other than the API gateway port. |
| jetson-monitoring | No | Only needed to track system metrics. |
| jetson-sys-monitoring | No | Only needed to track system metrics. |
| jetson-gpu-monitoring | No | Only needed to track GPU metrics. |
| jetson-networking | No | Only needed if using VST and IP cameras with the Zero Shot Detection service. |

Getting Started

Read through the Prerequisites section carefully before getting started with this example.

Prerequisites

First follow the Quick Start Guide to set up your system with Jetson Platform Services. It is recommended to also follow the Hello World example to get familiar with Jetson Platform Services. Before continuing, bring down any previously launched JPS examples like AI-NVR with the docker compose down command.

The Zero Shot Detection AI service operates on RTSP streams. The RTSP stream can come from any source, such as an IP camera, the Video Storage Toolkit (VST), or NVStreamer. The fastest way to get an RTSP stream for testing is to use NVStreamer, which can serve video files as an RTSP stream. To learn how to use NVStreamer to make an RTSP stream, see the NVStreamer on Jetson Orin page.

The Zero Shot Detection AI service is provided as a prebuilt Docker container that can be launched with docker compose. It is configured through JSON config files and integrates with Jetson Platform Services such as jetson-redis and jetson-ingress.

To get the docker compose and config files, download the Jetson Platform Services resources bundle from NGC or SDK Manager. Once downloaded, find the zero_shot_detection-1.1.0.tar.gz file and place it in your home directory. The following commands assume the tar file is in your home directory.

cd ~/
tar -xvf zero_shot_detection-1.1.0.tar.gz
cd ~/zero_shot_detection/example_1

The Zero Shot Detection AI service will use the jetson-ingress and jetson-redis services. The jetson-ingress service needs to be configured to expose the Zero Shot Detection REST API. Copy the provided default configuration to the appropriate service configuration directory.

sudo cp config/zero_shot_detection-nginx.conf /opt/nvidia/jetson/services/ingress/config

Then start the foundation services:

sudo systemctl start jetson-ingress
sudo systemctl start jetson-redis

Note

If any of the jetson services were previously launched then use the ‘restart’ command instead of ‘start’.

Now deploy the Zero Shot Detection AI Service!

sudo docker compose up -d

To check if all the necessary containers have started up, you can run the following command:

sudo docker ps

The output should look similar to the following image.

Zero Shot Docker PS

Note

The first time this is launched, it will automatically download and optimize the zero shot detection model. This will take some time.

Interact with Zero Shot Detection Service

  1. Control Stream Input via REST APIs

You can start by adding an RTSP stream for the Zero Shot Detection model to use with the following curl command. This will use the POST method on the live-stream endpoint.

  • Replace 0.0.0.0 with your Jetson's IP address and replace the example RTSP URL with the URL of your own RTSP stream.

curl --location 'http://0.0.0.0:5010/api/v1/live-stream' \
--header 'Content-Type: application/json' \
--data '{
"liveStreamUrl": "rtsp://0.0.0.0:31554/nvstream/root/store/nvstreamer_videos/video.mp4"
}'

Note

In addition to the curl commands, the REST APIs can also be tested directly through the API documentation page that is served at http://0.0.0.0:5010/docs when the Zero Shot Detection service is brought up.

This request will return a unique stream ID that is used later to set detection classes and remove the stream.

{
    "id": "a782e200-eb48-4d17-a1b9-5ac0696217f7"
}

You can also use the GET method on the live-stream endpoint to list the added streams and their IDs:

curl --location 'http://0.0.0.0:5010/api/v1/live-stream'
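
The two live-stream calls above can also be scripted. The following is a minimal Python sketch using only the standard library. The helper names (build_add_stream_request, build_list_streams_request, send) are illustrative and not part of the service, and the base URL assumes the local port 5010 used in the curl examples.

```python
import json
import urllib.request

# Assumption: local API port from the curl examples; replace 0.0.0.0 with your Jetson IP.
API_BASE = "http://0.0.0.0:5010/api/v1"


def build_add_stream_request(rtsp_url, api_base=API_BASE):
    """Build the POST request that registers an RTSP stream with the service."""
    body = json.dumps({"liveStreamUrl": rtsp_url}).encode("utf-8")
    return urllib.request.Request(
        f"{api_base}/live-stream",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def build_list_streams_request(api_base=API_BASE):
    """Build the GET request that lists the added streams."""
    return urllib.request.Request(f"{api_base}/live-stream", method="GET")


def send(request, timeout=10):
    """Send a request and decode the JSON reply (requires the service to be running)."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


# Example usage, left commented out because it needs a running service:
# reply = send(build_add_stream_request("rtsp://<your-stream-url>"))
# stream_id = reply["id"]
```

Keeping request construction separate from sending makes it possible to inspect the URL, method, and body before anything goes over the network.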

  2. Add Detection Classes

The zero shot detection model can update its detection classes at runtime. This endpoint accepts a list of objects to detect and a matching list of threshold values, one per object. The threshold sets the sensitivity of the detection: higher values reduce false positives, and lower values increase them. Values between 0.1-2.0 generally work well. Once the detection classes are set, the model continuously evaluates the input stream.

Currently this service supports only one stream, but in the future this API will allow for multi-stream support.

curl --location 'http://0.0.0.0:5010/api/v1/detection/classes' \
--header 'Content-Type: application/json' \
--data '{
    "objects": ["a zebra", "an elephant", "a giraffe"],
    "thresholds": [0.03, 0.02, 0.01],
    "id": "a782e200-eb48-4d17-a1b9-5ac0696217f7"
}'
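
The request body pairs each object with a threshold by position, so the two lists must stay the same length. A minimal Python sketch of one way to guarantee that is to build both lists from a single mapping. The helper names are hypothetical, and the base URL assumes the local port 5010 from the curl example.

```python
import json
import urllib.request

# Assumption: local API port from the curl examples; replace 0.0.0.0 with your Jetson IP.
API_BASE = "http://0.0.0.0:5010/api/v1"


def build_classes_payload(stream_id, thresholds_by_object):
    """Return the request body: parallel 'objects' and 'thresholds' lists plus the stream id."""
    if not thresholds_by_object:
        raise ValueError("at least one object is required")
    objects = list(thresholds_by_object)  # dict insertion order is preserved
    thresholds = [float(thresholds_by_object[name]) for name in objects]
    return {"objects": objects, "thresholds": thresholds, "id": stream_id}


def build_set_classes_request(payload, api_base=API_BASE):
    """Build the POST request for the detection/classes endpoint."""
    return urllib.request.Request(
        f"{api_base}/detection/classes",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


payload = build_classes_payload(
    "a782e200-eb48-4d17-a1b9-5ac0696217f7",
    {"a zebra": 0.03, "an elephant": 0.02, "a giraffe": 0.01},
)
print(json.dumps(payload, indent=4))

# To apply it (service must be running):
# urllib.request.urlopen(build_set_classes_request(payload))
```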

  3. View RTSP Stream Output

Once a stream is added, it will be passed through to the output RTSP stream. You can view this stream at rtsp://0.0.0.0:5011/out. Once detection classes are set, the detection overlay is visible on this output stream.

  4. View Detection Metadata Output

The redis-cli can be used from the command line to view the metadata output.

redis-cli XREAD COUNT 1 BLOCK 5000 STREAMS owl $
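
The same read can be done from Python. This is a sketch that assumes the third-party redis-py package is installed and that Redis is reachable at the host and port from main_config.json. The function names are illustrative, and the fields inside each entry are returned as-is because their names are not documented here.

```python
def flatten_xread_reply(reply):
    """Turn an XREAD reply into a flat list of (stream, entry_id, fields) tuples."""
    entries = []
    for stream_name, stream_entries in reply or []:
        for entry_id, fields in stream_entries:
            entries.append((stream_name, entry_id, dict(fields)))
    return entries


def read_latest_entries(host="0.0.0.0", port=6379, stream="owl", block_ms=5000):
    """Wait up to block_ms for one new entry on the stream, like the redis-cli call above."""
    import redis  # third-party package (pip install redis); imported lazily on purpose

    client = redis.Redis(host=host, port=port, decode_responses=True)
    # "$" means: only entries that arrive after this call starts.
    reply = client.xread({stream: "$"}, count=1, block=block_ms)
    return flatten_xread_reply(reply)


# Example usage, left commented out because it needs a running Redis server:
# for stream_name, entry_id, fields in read_latest_entries():
#     print(stream_name, entry_id, fields)
```

If no entry arrives before the timeout, the function returns an empty list.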

Note that if the system is rebooted while the container is running, it will automatically come back up but the added streams and detection classes will not persist. They will need to be added again. For streams to persist across reboots, the AI service can be combined with SDR as shown in the Zero Shot Detection Workflow page.

  5. Shut Down

To shut down the example you can first remove the stream using a DELETE method on the live-stream endpoint. Note the stream ID is added to the URL path for this.

curl --location --request DELETE 'http://0.0.0.0:5010/api/v1/live-stream/a782e200-eb48-4d17-a1b9-5ac0696217f7'
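
For scripted cleanup, the DELETE call can be built the same way in Python. This is a small sketch with a hypothetical helper name. It assumes the local port 5010 and percent-encodes the ID before placing it in the URL path.

```python
import urllib.parse
import urllib.request

# Assumption: local API port from the curl examples; replace 0.0.0.0 with your Jetson IP.
API_BASE = "http://0.0.0.0:5010/api/v1"


def build_remove_stream_request(stream_id, api_base=API_BASE):
    """Build the DELETE request that removes a stream; the ID goes in the URL path."""
    safe_id = urllib.parse.quote(stream_id, safe="")
    return urllib.request.Request(f"{api_base}/live-stream/{safe_id}", method="DELETE")


request = build_remove_stream_request("a782e200-eb48-4d17-a1b9-5ac0696217f7")
print(request.get_method(), request.full_url)

# To send it (service must be running): urllib.request.urlopen(request)
```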

Then, from the same folder as the compose.yaml file used to launch the example, run:

sudo docker compose down

To summarize, this section covered how to launch the Zero Shot Detection AI service and then interact with it through the REST APIs and view the RTSP output and Redis output.

Here is a summary of useful addresses when interacting with the Zero Shot Detection service.

Access points for the Zero Shot Detection Service

| Name | Local URI | API Gateway URI | Description |
| --- | --- | --- | --- |
| REST API Docs | http://0.0.0.0:5010/docs | http://0.0.0.0:30080/zero_shot_detection/docs | Documentation for Zero Shot Detection AI Service REST API |
| REST API | http://0.0.0.0:5010/api/v1/ | http://0.0.0.0:30080/zero_shot_detection/api/v1 | Control REST API for Zero Shot Detection AI Service |
| RTSP output | rtsp://0.0.0.0:5011/out | | Detection overlay output |
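
The local and API Gateway HTTP addresses differ only in the port and a path prefix. A short Python sketch of that mapping follows; the function name and constants are illustrative, with values taken from the addresses listed above.

```python
LOCAL_PORT = 5010                        # direct access to the AI service
GATEWAY_PORT = 30080                     # jetson-ingress API Gateway port
GATEWAY_PREFIX = "/zero_shot_detection"  # path prefix added by the gateway route


def service_url(host, path, via_gateway=False):
    """Return the HTTP URL for a service path, either direct or through the API Gateway."""
    if not path.startswith("/"):
        path = "/" + path
    if via_gateway:
        return f"http://{host}:{GATEWAY_PORT}{GATEWAY_PREFIX}{path}"
    return f"http://{host}:{LOCAL_PORT}{path}"


print(service_url("0.0.0.0", "/docs"))
print(service_url("0.0.0.0", "/docs", via_gateway=True))
```

The RTSP output is not covered by this helper because only a local URI is listed for it.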

Configuration

All configuration files can be found under the ~/zero_shot_detection/example_1/config directory. The configuration options available are split into two categories.

  • Zero Shot Detection Service configuration

    • main_config.json

  • Foundation Service Configuration

    • zero_shot_detection-nginx.conf

main_config.json

The main_config.json configures the streaming pipeline that will evaluate the stream input with NanoOWL and expose a REST API to configure the stream input and detection classes.

{
    "api_server_port": 5010,
    "stream_output": "rtsp://0.0.0.0:5011/out",
    "redis_host": "0.0.0.0",
    "redis_port": 6379,
    "redis_stream": "owl",
    "redis_output_interval": 60,
    "log_level": "INFO"
}

| Key | Value Type | Value Example | Description | Notes |
| --- | --- | --- | --- | --- |
| "api_server_port" | int | 5010 | Port on which the main pipeline exposes its REST APIs for stream and model control. | |
| "stream_output" | str | "rtsp://0.0.0.0:5011/out" | Output URI of the RTSP stream generated by the AI service. | |
| "redis_host" | str | "0.0.0.0" | Host of the Redis server. | If running Redis locally with the jetson-redis service, set this value to "0.0.0.0". |
| "redis_port" | int | 6379 | The Redis server port to connect to. | If running Redis locally with the jetson-redis service, set this value to 6379. |
| "redis_stream" | str | "owl" | The Redis stream name to output metadata to. | |
| "redis_output_interval" | int | 60 | The metadata from every Nth frame is output to Redis, where N is this value. | |
| "log_level" | str | "INFO" | Python-based log level. | Supported values: ["DEBUG", "INFO", "WARNING", "ERROR", "CRITICAL"] |
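
Before launching the container, it can help to sanity-check an edited main_config.json. The sketch below is a hypothetical checker, not part of the service: the expected types are inferred from the example values in the JSON above, and the log levels come from the supported values listed for log_level.

```python
import json

# Assumption: types inferred from the example main_config.json values.
EXPECTED_TYPES = {
    "api_server_port": int,
    "stream_output": str,
    "redis_host": str,
    "redis_port": int,
    "redis_stream": str,
    "redis_output_interval": int,
    "log_level": str,
}
LOG_LEVELS = {"DEBUG", "INFO", "WARNING", "ERROR", "CRITICAL"}


def check_config(config):
    """Return a list of human-readable problems; an empty list means no problems found."""
    problems = []
    for key, expected in EXPECTED_TYPES.items():
        if key not in config:
            problems.append(f"missing key: {key}")
        elif type(config[key]) is not expected:  # exact match, so True is not accepted as int
            problems.append(f"{key}: expected {expected.__name__}")
    level = config.get("log_level")
    if isinstance(level, str) and level not in LOG_LEVELS:
        problems.append("log_level: unsupported value")
    return problems


example = json.loads("""
{
    "api_server_port": 5010,
    "stream_output": "rtsp://0.0.0.0:5011/out",
    "redis_host": "0.0.0.0",
    "redis_port": 6379,
    "redis_stream": "owl",
    "redis_output_interval": 60,
    "log_level": "INFO"
}
""")
print(check_config(example))  # prints []
```

To check a file on disk, load it with json.load and pass the resulting dictionary to check_config.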

Foundation Service Configuration

  • zero_shot_detection-nginx.conf

The zero_shot_detection-nginx.conf file is used to configure the jetson-ingress service (API Gateway) to route HTTP traffic to the Zero Shot Detection AI service. The documentation for jetson-ingress is available in API Gateway (Ingress).

Workflow Integration

The Zero Shot Detection service can be integrated with other parts of Jetson Platform Services to build a full end-to-end workflow for zero shot detection. To read more about the full workflow and how the Zero Shot Detection service can integrate with VST, SDR, Monitoring, and Redis, go to the Zero Shot Detection Workflow page.

Further Reading

To understand more about the Zero Shot Detection AI service, view the open source code on the jetson-platform-services repository.

Support for zero shot detection on Jetson comes from the NanoOWL project.

Benchmarks and other generative AI examples on Jetson can be found at the Jetson AI Lab.