Deploy Using Docker Compose ARM#

NVIDIA Jetson Thor supports the following deployment scenarios:

Deployment Scenario | VLM                       | LLM (Llama 3.1 70B) | Embedding (llama-3.2-nv-embedqa-1b-v2) | Reranker (llama-3.2-nv-rerankqa-1b-v2) | CV    | Audio
--------------------|---------------------------|---------------------|----------------------------------------|----------------------------------------|-------|-------
Remote Deployment   | Remote (OpenAI gpt-4o)    | Remote              | Remote                                 | Remote                                 | Local | Remote
Hybrid Deployment   | Local (Cosmos-Reason1-7B) | Remote              | Remote                                 | Remote                                 | Local | Remote
VSS Event Reviewer  | Local (Cosmos-Reason1-7B) | NA                  | NA                                     | NA                                     | NA    | NA

Note

To get started, clone the GitHub repository to get the Docker Compose samples:

git clone https://github.com/NVIDIA-AI-Blueprints/video-search-and-summarization.git
cd video-search-and-summarization/deploy/docker
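To confirm the checkout before continuing, you can list the contents of the current directory; the deployment folders used in the sections below, remote_vlm_deployment and remote_llm_deployment, should be present.

# Optional sanity check from deploy/docker: the deployment folders used
# below should be listed, including remote_vlm_deployment and remote_llm_deployment.
ls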

Run the cache cleaner script:

# In another terminal, start the cache cleaner script.
# Alternatively, append " &" to the end of the command to run it in the background.
sudo sh <video-search-and-summarization>/deploy/scripts/sys_cache_cleaner.sh
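If you prefer to run the cleaner in the background, as the comment above suggests, one approach is to capture its process ID so you can stop it when you are done (a sketch; replace the path placeholder with your checkout location):

# Run the cache cleaner in the background and remember its PID.
sudo sh <video-search-and-summarization>/deploy/scripts/sys_cache_cleaner.sh &
CLEANER_PID=$!

# Later, after stopping the deployment, stop the cleaner as well.
sudo kill "$CLEANER_PID"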

Log into NGC so all containers are accessible.

docker login nvcr.io

Then supply your NGC API Key.

Username: $oauthtoken
Password: <PASTE_API_KEY_HERE>
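For scripted or CI environments, the same login can be performed non-interactively by piping the key to docker login (standard Docker CLI flags; this assumes the key has been exported as NGC_API_KEY):

# Non-interactive NGC login; single quotes keep $oauthtoken literal.
echo "$NGC_API_KEY" | docker login nvcr.io --username '$oauthtoken' --password-stdin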

Follow one of the sections below to launch VSS with Docker Compose on Thor.

Note

For a list of all configuration options, refer to VSS Deployment-Time Configuration Glossary.

Remote Deployment#

The remote_vlm_deployment folder contains an example of how to launch VSS using remote endpoints for the VLM, LLM, Embedding, and Reranker models. This allows VSS to run with minimal hardware requirements; a modern GPU with at least 8 GB of VRAM is recommended.

To run this deployment, get an NVIDIA API key from build.nvidia.com and an OpenAI API key to use GPT-4o as the remote VLM. Any OpenAI-compatible VLM can also be used.

cd remote_vlm_deployment

The config.yaml file shows that the LLM, Reranker, and Embedding models are configured to use remote endpoints from build.nvidia.com.
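To verify this quickly from the shell, you can search the file for the relevant entries (a sketch; the exact key names in config.yaml may differ, so adjust the pattern as needed):

# Print config.yaml lines that mention the model or endpoint settings.
grep -n -i -E 'llm|embedding|rerank|endpoint|url' config.yaml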

Inside the remote_vlm_deployment folder, edit the .env file and populate the NVIDIA_API_KEY, NGC_API_KEY, and OPENAI_API_KEY fields. Optionally, adjust VIA_VLM_OPENAI_MODEL_DEPLOYMENT_NAME to use a different model, or point VIA_VLM_ENDPOINT at any OpenAI-compatible VLM. Optionally, to enable the CV pipeline, set DISABLE_CV_PIPELINE=false and INSTALL_PROPRIETARY_CODECS=true. For example:

#.env file

NVIDIA_API_KEY=nvapi-***
OPENAI_API_KEY=def456***
NGC_API_KEY=abc123***
#VIA_VLM_ENDPOINT=http://192.168.1.100:8000 # Optional
#VIA_VLM_OPENAI_MODEL_DEPLOYMENT_NAME=gpt-4o # Optional
DISABLE_CV_PIPELINE=true # Set to false to enable CV
INSTALL_PROPRIETARY_CODECS=false # Set to true to enable CV
.
.
.

Warning

The .env file stores your private API keys in plain text. Ensure proper security and permission settings are in place for this file, or use a secrets manager to pass the API key environment variables in a production environment.
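At a minimum, restricting the file to your own user limits casual exposure; this is a basic hardening step, not a substitute for a secrets manager:

# Allow only the file owner to read or write the .env file.
chmod 600 .env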

After setting your API keys in the .env file, you can launch VSS with Docker Compose.

docker compose up

After VSS has loaded, you can access the UI at port 9100.
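If you prefer not to keep a terminal attached, Compose can run the deployment in the background, and the logs can be followed while VSS loads. These are standard Docker Compose flags; the port check assumes the UI port 9100 from this guide:

# Start the deployment in the background.
docker compose up -d

# Follow the logs to watch VSS load; Ctrl+C stops following, not the containers.
docker compose logs -f

# Once loaded, the UI should respond on port 9100.
curl -s -o /dev/null -w "%{http_code}\n" http://localhost:9100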

Stopping the Deployment#

To stop any of the Docker Compose deployments, run the following command from the deployment directory:

docker compose down

This will stop and remove all containers created by Docker Compose.
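If the compose file defines named volumes (for example, for cached model data) and you want those removed as well, Compose accepts an additional flag; note that this deletes the volumes' data:

# Also remove named volumes declared in the compose file (deletes their data).
docker compose down -v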

Hybrid Deployment#

The remote_llm_deployment folder contains an example of how to launch VSS using the local built-in VLM and remote endpoints for the LLM, Embedding, and Reranker models. This requires at least 40 GB of VRAM to load the built-in Cosmos-Reason1-7B model and can be run on systems with 1xL40S, 1xA100 80GB, 1xH100, 1xB200, or 1xRTX PRO 6000 Blackwell SE GPUs.
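To check whether your system meets the 40 GB VRAM requirement before starting, nvidia-smi can report each GPU's total memory (standard nvidia-smi query flags):

# Report each GPU's name and total memory; at least 40 GB is required here.
nvidia-smi --query-gpu=name,memory.total --format=csv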

cd remote_llm_deployment

The config.yaml file shows that the LLM, Reranker, and Embedding models are configured to use remote endpoints from build.nvidia.com.

Inside the remote_llm_deployment folder, edit the .env file and populate the NVIDIA_API_KEY and NGC_API_KEY fields. In this file, you can see that the VLM is set to Cosmos-Reason1-7B. Optionally, to enable the CV pipeline, set DISABLE_CV_PIPELINE=false and INSTALL_PROPRIETARY_CODECS=true.

#.env file

NVIDIA_API_KEY=abc123***
NGC_API_KEY=def456***
DISABLE_CV_PIPELINE=true # Set to false to enable CV
INSTALL_PROPRIETARY_CODECS=false # Set to true to enable CV
.
.
.

After setting your API keys in the .env file, you can launch VSS with Docker Compose.

docker compose up

After VSS has loaded, you can access the UI at port 9100.

Stopping the Deployment#

To stop any of the Docker Compose deployments, run the following command from the deployment directory:

docker compose down

This will stop and remove all containers created by Docker Compose.