Deploy Using Docker Compose ARM#
NVIDIA Jetson Thor supports the following deployment scenarios:
| Deployment Scenario | VLM | LLM (Llama 3.1 70B) | Embedding (llama-3.2-nv-embedqa-1b-v2) | Reranker (llama-3.2-nv-rerankqa-1b-v2) | CV | Audio |
|---|---|---|---|---|---|---|
| Remote Deployment | Remote (OpenAI gpt-4o) | Remote | Remote | Remote | Local | Remote |
| Hybrid Deployment | Local (Cosmos-Reason1-7B) | Remote | Remote | Remote | Local | Remote |
| Standalone VLM Deployment | Local (Cosmos-Reason1-7B) | NA | NA | NA | NA | NA |
Note
All Docker commands on this page must be run without sudo. Ensure you can run Docker without sudo by following the Docker post-installation steps. Using sudo can break the way environment variables are passed into the container.
Refer to Prerequisites (NVIDIA Jetson Thor) for the necessary prerequisites.
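For reference, the Docker post-installation steps are typically the following (see the official Docker documentation for the authoritative version):
# Add your user to the docker group (the group may already exist).
sudo groupadd docker
sudo usermod -aG docker $USER
# Start a new session so the group change takes effect, then verify
# that Docker runs without sudo.
newgrp docker
docker run hello-world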
To get started, clone the GitHub repository to get the Docker Compose samples:
git clone https://github.com/NVIDIA-AI-Blueprints/video-search-and-summarization.git
cd video-search-and-summarization/deploy/docker
Run the cache cleaner script:
# In another terminal, start the cache cleaner script.
# Alternatively, append " &" to the end of the command to run it in the background.
sudo sh <video-search-and-summarization>/deploy/scripts/sys_cache_cleaner.sh
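For example, a sketch of running the cleaner in the background and stopping it once you are done (authenticate sudo up front so the background job does not block on a password prompt):
sudo -v   # cache sudo credentials first
sudo sh <video-search-and-summarization>/deploy/scripts/sys_cache_cleaner.sh &
CLEANER_PID=$!
# ...later, after stopping VSS:
sudo kill "$CLEANER_PID"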
Log in to NGC so that all containers are accessible.
docker login nvcr.io
Then supply your NGC API Key.
Username: $oauthtoken
Password: <PASTE_API_KEY_HERE>
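If you prefer a non-interactive login (for example, in a script), Docker's --password-stdin flag keeps the key out of your shell history; this sketch assumes the key is already exported as NGC_API_KEY:
# Single quotes keep the shell from expanding the literal username "$oauthtoken".
echo "$NGC_API_KEY" | docker login nvcr.io --username '$oauthtoken' --password-stdin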
Follow one of the sections below to launch VSS with Docker Compose on Thor.
Note
For a list of all configuration options, refer to VSS Deployment-Time Configuration Glossary.
Remote Deployment#
The remote_vlm_deployment folder contains an example of how to launch VSS using remote endpoints for the VLM, LLM, Embedding, and Reranker models. This allows VSS to run with minimal hardware requirements; a modern GPU with at least 8 GB of VRAM is recommended.
To run this deployment, get an NVIDIA API key from build.nvidia.com and an OpenAI API key to use GPT-4o as the remote VLM. Any OpenAI-compatible VLM can also be used.
cd remote_vlm_deployment
If you look in the config.yaml file, you will observe that the LLM, Reranker, and Embedding model configurations have been set to use remote endpoints from build.nvidia.com.
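As a quick sanity check (an illustrative sketch, assuming the build.nvidia.com endpoints follow the usual OpenAI-compatible scheme at integrate.api.nvidia.com and your key is exported as NVIDIA_API_KEY), you can confirm the key is accepted before launching:
# Expect HTTP 200 with a valid key; 401 indicates a bad or missing key.
curl -s -o /dev/null -w "%{http_code}\n" \
    -H "Authorization: Bearer $NVIDIA_API_KEY" \
    https://integrate.api.nvidia.com/v1/models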
Inside the remote_vlm_deployment folder, edit the .env file and populate the NVIDIA_API_KEY, NGC_API_KEY, and OPENAI_API_KEY fields. Optionally, VIA_VLM_OPENAI_MODEL_DEPLOYMENT_NAME can be adjusted to use a different model, and VIA_VLM_ENDPOINT can be adjusted to point to any OpenAI-compatible VLM. To enable the CV pipeline, set DISABLE_CV_PIPELINE=false and INSTALL_PROPRIETARY_CODECS=true. For example:
#.env file
NVIDIA_API_KEY=nvapi-***
OPENAI_API_KEY=def456***
NGC_API_KEY=abc123***
#VIA_VLM_ENDPOINT=http://192.168.1.100:8000 # Optional
#VIA_VLM_OPENAI_MODEL_DEPLOYMENT_NAME=gpt-4o # Optional
DISABLE_CV_PIPELINE=true # Set to false to enable CV
INSTALL_PROPRIETARY_CODECS=false # Set to true to enable CV
.
.
.
Warning
The .env file will store your private API keys in plain text. Ensure proper security and permission settings are in place for this file or use a secrets manager to pass API key environment variables in a production environment.
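For example, one minimal precaution is to make the file readable only by your own user:
# Restrict the .env file so other users on the system cannot read the keys.
chmod 600 .env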
After setting your API keys in the .env file, you can launch VSS with Docker Compose.
docker compose up
After VSS has loaded, you can access the UI at port 9100.
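If you prefer to run the stack in the background, the standard Docker Compose flags apply; for example:
docker compose up -d            # start detached
docker compose logs -f          # follow the logs until VSS reports it is ready
curl -I http://localhost:9100   # the UI port should respond once VSS has loaded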
Stopping the Deployment#
To stop any of the Docker Compose deployments, run the following command from the deployment directory:
docker compose down
This will stop and remove all containers created by Docker Compose.
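You can confirm the teardown by listing what Compose still has running:
docker compose ps   # should print an empty list after "down"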
Hybrid Deployment#
The remote_llm_deployment folder contains an example of how to launch VSS using the local built-in VLM and remote endpoints for the LLM, Embedding, and Reranker models.
This requires at least 40 GB of VRAM to load the built-in Cosmos-Reason1-7B model. It can be run on systems with 1xL40S, 1xA100 80GB, 1xH100, 1xB200, or 1xRTX PRO 6000 Blackwell SE GPUs.
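Jetson devices use a unified memory pool shared by the CPU and GPU, so a rough availability check before launching is (a sketch, not an exact VRAM measurement):
free -h   # confirm enough free unified memory for the ~40 GB model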
cd remote_llm_deployment
If you look in the config.yaml file, you will observe that the LLM, Reranker, and Embedding model configurations have been set to use remote endpoints from build.nvidia.com.
Inside the remote_llm_deployment folder, edit the .env file and populate the NVIDIA_API_KEY and NGC_API_KEY fields. You can observe in this file that the VLM is set to Cosmos-Reason1-7B. Optionally, to enable the CV pipeline, set DISABLE_CV_PIPELINE=false and INSTALL_PROPRIETARY_CODECS=true. For example:
#.env file
NVIDIA_API_KEY=abc123***
NGC_API_KEY=def456***
DISABLE_CV_PIPELINE=true # Set to false to enable CV
INSTALL_PROPRIETARY_CODECS=false # Set to true to enable CV
.
.
.
After setting your API keys in the .env file, you can launch VSS with Docker Compose.
docker compose up
After VSS has loaded, you can access the UI at port 9100.
Stopping the Deployment#
To stop any of the Docker Compose deployments, run the following command from the deployment directory:
docker compose down
This will stop and remove all containers created by Docker Compose.