VSS Deployment-Time Configuration Glossary#

VSS supports a variety of configuration options that can be used to customize the behavior of the system.

VSS Configuration - List of environment variables#

Below you will find an exhaustive list of environment variables available to configure VSS.

Note for developers:

For details on the most useful and frequently used configurations, click the configuration option in the table below.

For additional details and insights, please see the source code of VSS packaged inside the VSS container at /opt/nvidia/via/.

For questions, please post on the official VSS forum page.

Each entry below lists the configuration option or environment variable, followed by a short description.

FRONTEND_PORT: Port for the frontend service (Docker Compose only).
BACKEND_PORT: Port for the backend service (Docker Compose only).
INSTALL_PROPRIETARY_CODECS: Install additional multimedia packages required for live-stream preview, audio, and CV pipelines.
FORCE_SW_AV1_DECODER: Use a software decoder for AV1 content.
DISABLE_FRONTEND: Disable the Gradio UI.
NGC_API_KEY: Required for downloading models from NGC.
NVIDIA_API_KEY: NVIDIA Personal Key for using the LLM, Rerank, and Embedding NIMs from build.nvidia.com.
VIA_IMAGE: Container image for the VSS engine component of the VSS blueprint.
GRAPH_DB_USERNAME, GRAPH_DB_PASSWORD: Credentials for authenticating access to the graph database used by VSS.
MILVUS_DB_HOST, MILVUS_DB_PORT: Connection parameters for an external Milvus DB.
MODEL_ROOT_DIR: Path to mount in the container for locally available model files for VILA-1.5, NVILA, and CV models (Docker Compose only).
NGC_MODEL_CACHE: Path for the NGC model cache; defaults to a Docker volume for Docker Compose deployments and PVC storage for Helm deployments.
EXAMPLE_STREAMS_DIR: Directory of sample streams for the Gradio UI.
MILVUS_DATA_DIR: Path for the Milvus DB data directory (Docker Compose only).
ASSET_STORAGE_DIR: Path to store uploaded files and associated data.
VLM_MODEL_TO_USE: Model to use ("vila-1.5", "openai-compat", "nvila", "custom").
MODEL_PATH (VLM): Path of the VLM model.
VLM_BATCH_SIZE: Batch size for the VLM; auto-determined if not set.
TRT_LLM_MODE: Precision mode for the TRT engine (fp16, int8, int4, int4_awq). Affects the performance and precision of the LLM.
TRT_ENGINE_PATH: Path to read/write the VILA-1.5 TRT engine.
VILA_LORA_PATH: Path to the directory containing a LoRA for VILA.
VILA_ENGINE_NGC_RESOURCE: NGC resource for the prebuilt VILA-1.5 engine.
OPENAI_API_KEY / AZURE_OPENAI_API_KEY / VIA_VLM_API_KEY: API key to use with OpenAI or OpenAI API compatible models.
VIA_VLM_OPENAI_MODEL_DEPLOYMENT_NAME: Deployment name for an OpenAI or OpenAI API compatible model.
AZURE_OPENAI_ENDPOINT / VIA_VLM_ENDPOINT: Endpoint for an OpenAI or OpenAI API compatible model.
OPENAI_API_VERSION / AZURE_OPENAI_API_VERSION: API version for an OpenAI or OpenAI API compatible model.
CA_RAG_CONFIG: Custom CA-RAG configuration file.
DISABLE_CA_RAG: Disable CA-RAG (true/false).
GUARDRAILS_CONFIG: Path to a custom guardrails configuration.
DISABLE_GUARDRAILS: Disable guardrails (true/false).
ENABLE_AUDIO: Enable audio transcription using RIVA ASR (true/false).
RIVA_ASR_SERVER_URI: URI of the RIVA ASR service.
RIVA_ASR_GRPC_PORT: gRPC port for the RIVA ASR service.
RIVA_ASR_HTTP_PORT: HTTP port for the RIVA ASR service, used for the readiness status check.
ENABLE_RIVA_SERVER_READINESS_CHECK: Enable the Riva server readiness status check on the HTTP port of the Riva ASR service (true/false).
RIVA_ASR_SERVER_IS_NIM: Set to false if using a non-NIM RIVA deployment.
RIVA_ASR_SERVER_USE_SSL: Enable SSL for the RIVA ASR NIM (true/false).
RIVA_ASR_SERVER_API_KEY: API key for the RIVA ASR NIM.
RIVA_ASR_SERVER_FUNC_ID: Function ID for the RIVA ASR NIM service.
RIVA_ASR_MODEL_NAME: RIVA ASR model name; not needed for the NIM-based service.
DISABLE_CV_PIPELINE: Disable the CV pipeline (true/false).
GDINO_MODEL_PATH: Path to the GDINO ONNX model on the host.
CV_PIPELINE_TRACKER_CONFIG: Custom tracker configuration for the CV pipeline.
GDINO_INFERENCE_INTERVAL: GDINO inference interval (default: 1).
NUM_CV_CHUNKS_PER_GPU: Number of CV pipeline chunks that can run per GPU (default: 2).
Custom CV Models: Using custom Reidentification and SAM2 models.
GPU / Node Assignment: Configure GPUs and node assignments for individual services.
ENABLE_VIA_HEALTH_EVAL: Enable VIA health evaluation.
VIA_LOG_DIR: Path where VIA application logs should be written.
ENABLE_DENSE_CAPTION: Enable dense caption JSON file generation (true/false).
VSS_LOG_LEVEL: Set the log level for the VSS application.
VSS_EXTRA_ARGS: Extra arguments for the VSS via_server.py script.
VSS_DISABLE_LIVESTREAM_PREVIEW: Disable live-stream preview.
VSS_SKIP_INPUT_MEDIA_VERIFICATION: Skip input media verification.
NVILA_VIDEO_MAX_TILES: Maximum number of video tiles for NVILA.
TRT_LLM_MEM_USAGE_FRACTION: Fraction of GPU memory for TRT LLM.
VSS_RTSP_LATENCY: Amount of data to buffer for an RTSP (live stream) connection.
VSS_RTSP_TIMEOUT: Timeout in milliseconds to try a TCP connection for RTP data if UDP fails.

Guide for the exhaustive list of VSS environment variable configurations#

General Configuration Options#

Backend and Frontend Ports#

Configure the ports for the backend and frontend services.

Docker Compose#

Set FRONTEND_PORT=<PORT> and BACKEND_PORT=<PORT> in the .env file.
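For example, a minimal .env snippet (the port numbers below are placeholders, not documented defaults):

# Hypothetical host ports for the Gradio UI and the VSS REST API
FRONTEND_PORT=9100
BACKEND_PORT=8100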

Helm#

Not applicable for helm deployments.

Install Proprietary Codecs#

Install additional multimedia packages required for live-stream preview, audio, and CV pipelines.

Docker Compose#

Set INSTALL_PROPRIETARY_CODECS=<true/false> in the .env file.

Helm#

Set INSTALL_PROPRIETARY_CODECS environment variable to <true/false> as shown in Configuration Options. This requires root permissions, granted by setting securityContext as shown in Enabling Audio or Custom Container Image with Codecs Installed.
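A rough sketch of what the overrides might look like (the key paths and security settings below are assumptions; follow the referenced sections for the authoritative structure):

# Hypothetical Helm overrides sketch; key paths are assumptions
vss:
  applicationSpecs:
    vss-deployment:
      securityContext:
        runAsUser: 0        # root is needed to install the codec packages
      containers:
        vss:
          env:
            - name: INSTALL_PROPRIETARY_CODECS
              value: "true"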

Force Software Decoding for AV1 content#

Force software decoding for AV1 streams on platforms where hardware decoding of AV1 content is not supported.

Docker Compose#

Set FORCE_SW_AV1_DECODER=<true/false> in the .env file.

Helm#

Set FORCE_SW_AV1_DECODER environment variable in the Helm overrides file.

Disable Frontend#

Disable the Gradio UI.

Docker Compose#

Set DISABLE_FRONTEND=<true/false> in the .env file.

Helm#

Set DISABLE_FRONTEND environment variable to <true/false> as shown in Configuration Options.

NGC API Key#

Required for downloading models and containers from NGC. Refer to Obtain NGC API Key.

Docker Compose#

Set NGC_API_KEY=<YOUR_API_KEY> in the .env file.

Helm#

Refer to Create Required Secrets for more information on creating the kubernetes secrets for the NGC API Key.

NVIDIA API Key#

NVIDIA Personal Key for using the LLM, Rerank, and Embedding NIMs from build.nvidia.com. This key is essential for accessing NVIDIA's cloud services and models. Refer to Using NIMs from build.nvidia.com for details on how to create the NVIDIA Personal Key.

Docker Compose#

Set NVIDIA_API_KEY=<YOUR_API_KEY> in the .env file.

Helm#

Refer to Using NIMs from build.nvidia.com.

VSS Container Image#

VSS container image to use. This specifies the container image for the VSS engine component of the VSS blueprint.

Docker Compose#

Set VIA_IMAGE=<IMAGE_NAME> in the .env file. Also make sure to log in to the container registry if it requires authentication.

Helm#

Set image.repository and image.tag in the Helm overrides file as shown in Configuration Options.

In case the container image is hosted on a private registry, set imagePullSecrets in the Helm overrides file.
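A hedged sketch of the corresponding overrides (the repository, tag, and secret name below are placeholders):

# image.repository / image.tag / imagePullSecrets as named above; values are placeholders
image:
  repository: nvcr.io/nvidia/blueprint/vss-engine   # placeholder repository
  tag: "x.y.z"                                      # placeholder tag
imagePullSecrets:
  - name: my-registry-secret                        # placeholder pull secret name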

Graph DB Credentials#

Username and password for the graph database. This is required for authenticating access to the graph database used by VSS.

Docker Compose#

Set GRAPH_DB_USERNAME=<USERNAME> and GRAPH_DB_PASSWORD=<PASSWORD> in the .env file.

Helm#

Refer to Create Required Secrets for more information on creating the kubernetes secrets for the graph database credentials.

External Milvus DB Connection Parameters#

Host and port for external Milvus DB.

Docker Compose#

Set MILVUS_DB_HOST=<HOST> and MILVUS_DB_PORT=<PORT> in the .env file.

Helm#

Not supported.

Storage Configuration#

Model Root Directory#

Path to mount in the container when using locally available model files for VILA-1.5, NVILA, and CV models. Individual model files or directories must be subdirectories of this path.

Docker Compose#

Set MODEL_ROOT_DIR=<PATH> in the .env file. <PATH> is a host directory containing the model files.

Helm#

Not applicable.

NGC Model Cache#

Path for NGC model cache, defaults to docker volume for docker compose deployment and PVC storage for helm deployments.

Docker Compose#

Set NGC_MODEL_CACHE=<PATH> in the .env file. <PATH> can be a host directory or a docker volume name.

Helm#

Set to vss-ngc-model-cache-pvc PVC with default storage class.

Example Streams Directory#

Directory of sample streams for the Gradio UI. By default, VSS includes a few preloaded videos.

For each video file, create a poster image file named <video.mp4>.poster.jpg. Example command to generate a poster image:

ffmpeg -i <video.mp4> -vframes 1 <video.mp4>.poster.jpg

The EXAMPLE_STREAMS_DIR contents should look like this:

ls <SAMPLE_STREAMS_DIR_ON_HOST>/
video1.mp4 video1.mp4.poster.jpg video2.mp4 video2.mp4.poster.jpg image1.jpg image2.jpg

Docker Compose#

Set EXAMPLE_STREAMS_DIR=<PATH> in the .env file. <PATH> is a host directory containing example videos.

Helm#

Refer to Configure the Input Video Streams Directory.

Milvus Data Directory#

Path for Milvus DB data directory. By default, uses container internal storage.

Docker Compose#

Set MILVUS_DATA_DIR=<PATH> in the .env file. <PATH> can be a host directory or a docker volume name.

Helm#

Currently not configurable.

Asset Storage Directory#

Path to store uploaded files. Defaults to container internal storage and will be cleared on container restart.

Docker Compose#

Set ASSET_STORAGE_DIR=<PATH> in the .env file. <PATH> can be a host directory or a docker volume name.

Helm#

Refer to Configure the Assets Directory.

VLM Configuration#

VLM Model to Use#

Model type to use. Can be one of vila-1.5, nvila, openai-compat or custom.

vila-1.5 and nvila are locally executed models.

openai-compat uses a remote model served through an OpenAI API compatible endpoint, such as gpt-4o.

custom uses a custom model which can be executed locally or used as a remote model with REST API. Refer to OpenAI Compatible REST API for more information.

Docker Compose#

Set VLM_MODEL_TO_USE=<MODEL> in the .env file.

Helm#

Set VLM_MODEL_TO_USE environment variable in the Helm overrides file as shown in Configuration Options.

VLM Model Path#

Path of VLM model for VILA-1.5, NVILA and custom models.

Path can be an NGC model resource string (ngc:<org>/<team>/<name>:<tag>), a git repository URL (git+https://<git-repo-url>/<org>/<repo>.git), or a local file path. In case of NGC resource or git repository, the model will be downloaded automatically.

Example Values
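For illustration, hedged example values following the formats above (the organization, repository, and path names are placeholders):

MODEL_PATH=ngc:nvidia/example-team/example-vlm:1.0              # placeholder NGC model resource
MODEL_PATH=git+https://github.com/example-org/example-vlm.git   # placeholder git repository
MODEL_PATH=/home/user/models/example-vlm                        # placeholder local path under MODEL_ROOT_DIR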

For custom models, this must be a local file path containing the inference.py file as described in OpenAI Compatible REST API.

Docker Compose#

Set MODEL_PATH=<PATH/URL> in the .env file. For local model files, this must fall under MODEL_ROOT_DIR.

Helm#

Set MODEL_PATH environment variable in the Helm overrides file as shown in Configuration Options. When using local model files, the directory must be mounted using extraPodVolumes and extraPodVolumeMounts as shown in Configuring for locally downloaded VILA 1.5 / NVILA checkpoint or OpenAI Compatible REST API.
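As a rough sketch of such a mount in the overrides file (the nesting under the VSS chart values and the paths are assumptions; the referenced sections show the authoritative form):

# extraPodVolumes / extraPodVolumeMounts as named above; paths are placeholders
vss:
  extraPodVolumes:
    - name: local-vlm-model
      hostPath:
        path: /path/to/local/checkpoint       # placeholder host directory with the checkpoint
  extraPodVolumeMounts:
    - name: local-vlm-model
      mountPath: /models/local-vlm-model      # placeholder mount point; set MODEL_PATH to this path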

VLM Batch Size#

Batch size for VLM, auto-determined if not set. Applicable only to vila-1.5 and nvila models.

Docker Compose#

Set VLM_BATCH_SIZE=<SIZE> in the .env file.

Helm#

Set VLM_BATCH_SIZE environment variable in the Helm overrides file as shown in Configuration Options.

Refer to Configure the VLM for details on configuring VLMs, including batch size settings.

TRT LLM Mode#

Precision mode for TRT engine (fp16, int8, int4_awq). This setting affects the performance and precision of the LLM decoder part of the VLM. Applicable only to vila-1.5 and nvila models. Defaults to int4_awq for vila-1.5. For nvila, only fp16 is supported.

Docker Compose#

Set TRT_LLM_MODE=<MODE> in the .env file.

Helm#

Set TRT_LLM_MODE environment variable in the Helm overrides file as shown in Configuration Options.

TRT Engine Path#

Path to read/write VILA TRT engine.

Docker Compose#

Set TRT_ENGINE_PATH=<PATH> in the .env file.

Helm#

Not applicable.

VILA LoRA Path#

Path to directory containing LoRA for VILA-1.5.

Docker Compose#

Set VILA_LORA_PATH=<PATH> in the .env file. <PATH> must fall under MODEL_ROOT_DIR.

Helm#

Refer to Configuring for Fine-tuned VILA 1.5 (LoRA).

VILA Prebuilt Engine NGC Resource#

NGC resource for prebuilt VILA-1.5 engine.

The following engines are currently available:

  • H100 SXM - nvidia/blueprint/vss-vlm-prebuilt-engine:2.3.0-vila-1.5-40b-h100-sxm

  • L40S - nvidia/blueprint/vss-vlm-prebuilt-engine:2.3.0-vila-1.5-40b-l40s

Docker Compose#

Set VILA_ENGINE_NGC_RESOURCE=<NGC_RESOURCE> in the .env file.
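For example, to select the prebuilt L40S engine listed above (use the resource string exactly as published for your GPU):

VILA_ENGINE_NGC_RESOURCE=nvidia/blueprint/vss-vlm-prebuilt-engine:2.3.0-vila-1.5-40b-l40s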

Helm#

Set VILA_ENGINE_NGC_RESOURCE environment variable in the Helm overrides file as shown in Configuration Options.

OpenAI / Azure OpenAI API / VLM API Key#

API Key to use with OpenAI or OpenAI API compatible models.

OPENAI_API_KEY is used for OpenAI models at https://api.openai.com/v1.

AZURE_OPENAI_API_KEY is used for Azure OpenAI endpoints.

VIA_VLM_API_KEY is used for other VLM endpoints that are OpenAI API compatible.

Docker Compose#

Set OPENAI_API_KEY=<API_KEY> or AZURE_OPENAI_API_KEY=<API_KEY> or VIA_VLM_API_KEY=<API_KEY> in the .env file.

Helm#

Create a kubernetes secret and set OPENAI_API_KEY or AZURE_OPENAI_API_KEY or VIA_VLM_API_KEY environment variable in the Helm overrides file as shown in Override the configuration.

VIA VLM OpenAI Model Deployment Name#

Deployment name for OpenAI / OpenAI API compatible model. Default is gpt-4o.

Docker Compose#

Set VIA_VLM_OPENAI_MODEL_DEPLOYMENT_NAME=<NAME> in the .env file.

Helm#

Set VIA_VLM_OPENAI_MODEL_DEPLOYMENT_NAME environment variable in the Helm overrides file as shown in Configuration Options.

Refer to Configuring for GPT-4o for more information on using OpenAI models with VSS.

Endpoint for OpenAI / OpenAI API compatible VLM model#

Default is OpenAI (https://api.openai.com/v1) if not specified. Can be set to a custom endpoint. Specify AZURE_OPENAI_ENDPOINT for Azure OpenAI endpoints and VIA_VLM_ENDPOINT for other OpenAI API compatible endpoints.

Docker Compose#

Set AZURE_OPENAI_ENDPOINT=<ENDPOINT> or VIA_VLM_ENDPOINT=<ENDPOINT> in the .env file. No configuration is required for OpenAI (https://api.openai.com/v1) endpoints.

Helm#

Set AZURE_OPENAI_ENDPOINT or VIA_VLM_ENDPOINT environment variable in the Helm overrides file as shown in Configuration Options.

Refer to Using External Endpoints for guidance on using external endpoints, such as Azure OpenAI, with VSS.

API Version for OpenAI / OpenAI API compatible VLM model#

API version for OpenAI or OpenAI API compatible model. Set AZURE_OPENAI_API_VERSION for Azure OpenAI endpoints, OPENAI_API_VERSION otherwise.

Docker Compose#

Set OPENAI_API_VERSION=<VERSION> or AZURE_OPENAI_API_VERSION=<VERSION> in the .env file.

Helm#

Set OPENAI_API_VERSION or AZURE_OPENAI_API_VERSION environment variable in the Helm overrides file as shown in Configuration Options.

Refer to Using External Endpoints for details on configuring API versions when using OpenAI services with VSS.
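Putting these together, a hedged .env sketch for an Azure OpenAI backed VLM (all values below are placeholders):

# Example .env sketch for an Azure OpenAI VLM endpoint; all values are placeholders
VLM_MODEL_TO_USE=openai-compat
AZURE_OPENAI_API_KEY=<YOUR_AZURE_KEY>
AZURE_OPENAI_ENDPOINT=https://example-resource.openai.azure.com/   # placeholder endpoint
AZURE_OPENAI_API_VERSION=2024-02-15-preview                        # placeholder API version
VIA_VLM_OPENAI_MODEL_DEPLOYMENT_NAME=gpt-4o                        # placeholder deployment name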

Context-Aware RAG Configuration#

CA RAG Config#

Custom CA-RAG configuration file. A default configuration is provided with the VSS blueprint / docker image.

Most of the parameters can be configured at runtime through the VSS API. However, defaults can be set using the config file when these are not provided in the API.

LLM models and other NIMs can also be configured, for example to use a remote LLM model/NIM instead of the default local model, or to change the LLM model. When using NIMs from build.nvidia.com, an NVIDIA Personal Key is required.

Refer to CA-RAG Configuration for more information on the CA-RAG configuration file.

Samples for the CA-RAG configuration file for various deployment scenarios are available at NVIDIA-AI-Blueprints/video-search-and-summarization. Look for the config.yaml file in the respective directories.

Docker Compose#

Set CA_RAG_CONFIG=<PATH> in the .env file. <PATH> is the host path for the CA-RAG configuration file.

Helm#

Refer to CA-RAG Configuration for updating the CA-RAG configuration file for helm deployments. Also refer to Configure the NIMs and Using External Endpoints for more information on configuring LLM models and other NIMs for helm deployments.

Disable CA RAG#

Disable CA-RAG (true/false).

Docker Compose#

Set DISABLE_CA_RAG=<true/false> in the .env file.

Helm#

Set DISABLE_CA_RAG environment variable in the Helm overrides file as shown in Configuration Options.

Guardrails Configuration#

DISABLE_GUARDRAILS#

Disable the guardrails check of user prompts (true/false).

Docker Compose#

Set DISABLE_GUARDRAILS=<true/false> in the .env file.

Helm#

Set DISABLE_GUARDRAILS environment variable in the Helm overrides file as shown in Configuration Options.

Guardrails Configuration file#

A default Guardrails configuration is provided with the VSS blueprint. Users can provide a custom Guardrails configuration file.

Docker Compose#

Set GUARDRAILS_CONFIG=<PATH> in the .env file. <PATH> is the host path for the guardrails configuration directory.

Helm#

Set GUARDRAILS_CONFIG environment variable in the Helm overrides file as shown in Tuning Guardrails.

Audio / ASR Configuration#

Enable Audio#

Enable audio transcription using RIVA ASR (true/false).

Docker Compose#

Set ENABLE_AUDIO=<true/false> in the .env file.

Helm#

Set ENABLE_AUDIO environment variable in the Helm overrides file as shown in Enabling Audio.

Enable local ASR NIM Deployment#

Enable local ASR NIM deployment as part of blueprint (true/false).

Docker Compose#

Helm#

Set enabled to true for the riva subchart, along with other parameters, as shown in Enabling Audio.

RIVA ASR Server URI#

URI of the RIVA ASR service (e.g., 10.10.10.10). For the remote ASR service from build.nvidia.com, set this to grpc.nvcf.nvidia.com.

Docker Compose#

Set RIVA_ASR_SERVER_URI=<URI> in the .env file.

Helm#

Set RIVA_ASR_SERVER_URI environment variable in the Helm overrides file as shown in Using Riva ASR as a remote service or Using Riva ASR NIM from build.nvidia.com. If not set, the local ASR NIM deployment is used.

RIVA ASR GRPC Port#

GRPC port for RIVA ASR service.

Docker Compose#

Set RIVA_ASR_GRPC_PORT=<PORT> in the .env file.

Helm#

Set RIVA_ASR_GRPC_PORT environment variable in the Helm overrides file as shown in Using Riva ASR as a remote service or Using Riva ASR NIM from build.nvidia.com. If not set, the local ASR NIM deployment is used.

RIVA ASR HTTP Port#

HTTP port for RIVA ASR service. Set if the service provides readiness status on the HTTP port.

Docker Compose#

Set RIVA_ASR_HTTP_PORT=<PORT> in the .env file.

Helm#

Set RIVA_ASR_HTTP_PORT environment variable in the Helm overrides file.

Enable Riva ASR Server Readiness Check#

Set to true to enable the Riva server readiness check on $RIVA_ASR_SERVER_URI:$RIVA_ASR_HTTP_PORT/v1/health/ready. Enable this for local Riva ASR NIM based Docker deployments to ensure that the Riva ASR service has started before VSS.

Docker Compose#

Set ENABLE_RIVA_SERVER_READINESS_CHECK=<true/false> in the .env file.

Helm#

Set ENABLE_RIVA_SERVER_READINESS_CHECK environment variable in the Helm overrides file as shown in Enabling Audio.

RIVA ASR Server is NIM#

Set to false if using non-NIM RIVA deployment.

Docker Compose#

Set RIVA_ASR_SERVER_IS_NIM=<true/false> in the .env file.

Helm#

Set RIVA_ASR_SERVER_IS_NIM environment variable in the Helm overrides file as shown in Using Riva ASR as a remote service or Using Riva ASR NIM from build.nvidia.com. If not set, the local ASR NIM deployment is used.

RIVA ASR Server Use SSL#

Enable SSL authorization for RIVA ASR NIM (true/false).

Docker Compose#

Set RIVA_ASR_SERVER_USE_SSL=<true/false> in the .env file.

Helm#

Set RIVA_ASR_SERVER_USE_SSL environment variable in the Helm overrides file as shown in Using Riva ASR as a remote service or Using Riva ASR NIM from build.nvidia.com. If not set, the local ASR NIM deployment is used.

RIVA ASR Server API Key#

API key for accessing RIVA ASR NIM from build.nvidia.com. Refer to Using Riva ASR NIM from build.nvidia.com for steps to generate the API key.

Docker Compose#

Set RIVA_ASR_SERVER_API_KEY=<API_KEY> in the .env file.

Helm#

Set RIVA_ASR_SERVER_API_KEY environment variable in the Helm overrides file as shown in Using Riva ASR NIM from build.nvidia.com.

RIVA ASR Server Function ID#

Function ID to use RIVA ASR NIM service from build.nvidia.com. The function ID can be found on the RIVA NIM API page (e.g. https://build.nvidia.com/nvidia/parakeet-ctc-0_6b-asr/api).

Docker Compose#

Set RIVA_ASR_SERVER_FUNC_ID=<FUNCTION_ID> in the .env file.

Helm#

Set RIVA_ASR_SERVER_FUNC_ID environment variable in the Helm overrides file as shown in Using Riva ASR NIM from build.nvidia.com.
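Putting the remote ASR settings together, a hedged .env sketch for using the Riva ASR NIM from build.nvidia.com (the key, function ID, and port values are placeholders; the gRPC port in particular is an assumption):

# Example .env sketch for Riva ASR NIM from build.nvidia.com; values are placeholders
ENABLE_AUDIO=true
RIVA_ASR_SERVER_URI=grpc.nvcf.nvidia.com
RIVA_ASR_GRPC_PORT=443             # assumed port; confirm in the referenced section
RIVA_ASR_SERVER_IS_NIM=true
RIVA_ASR_SERVER_USE_SSL=true
RIVA_ASR_SERVER_API_KEY=<YOUR_API_KEY>
RIVA_ASR_SERVER_FUNC_ID=<FUNCTION_ID>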

RIVA ASR Model Name#

RIVA ASR model name, not needed for NIM-based service.

Docker Compose#

Set RIVA_ASR_MODEL_NAME=<MODEL_NAME> in the .env file.

Helm#

Set RIVA_ASR_MODEL_NAME environment variable in the Helm overrides file as shown in Using Riva ASR as a remote service.

CV Pipeline / Set-Of-Marks Prompting Configuration#

Disable CV Pipeline#

Disable CV pipeline (true/false). Default is true (disabled).

Docker Compose#

Set DISABLE_CV_PIPELINE=<true/false> in the .env file.

Helm#

Set DISABLE_CV_PIPELINE environment variable in the Helm overrides file as shown in Enabling CV Pipeline: Set-Of-Marks (SOM) & Metadata.

GDINO Model Path#

Path to Gdino ONNX model on host.

Docker Compose#

Set GDINO_MODEL_PATH=<PATH> in the .env file. <PATH> is the host path for the Gdino ONNX model. It must be under the <MODEL_ROOT_DIR> directory.
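For example, assuming MODEL_ROOT_DIR is already set, the model could be placed like this (the paths are placeholders):

# Example: GDINO ONNX model placed under MODEL_ROOT_DIR; paths are placeholders
MODEL_ROOT_DIR=/home/user/vss-models
GDINO_MODEL_PATH=/home/user/vss-models/gdino/model.onnx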

Helm#

Set GDINO_MODEL_PATH environment variable in the Helm overrides file as shown in Customizing the Detector.

CV Pipeline Tracker Config#

Custom tracker config for CV pipeline. If not specified, default tracker config provided with the VSS blueprint is used.

Refer to Customizing the Tracker for more samples for custom tracker configurations.

Docker Compose#

Set CV_PIPELINE_TRACKER_CONFIG=<PATH> in the .env file. <PATH> is the host path for the tracker config file.

Helm#

Refer to Customizing the Tracker for updating the tracker config for helm deployments.

GDINO Inference Interval#

GDINO inference interval (default: 1): the interval, in frames, at which Grounding DINO inference is run.

Docker Compose#

Set GDINO_INFERENCE_INTERVAL=<INTERVAL> in the .env file.

Helm#

Set GDINO_INFERENCE_INTERVAL environment variable in the Helm overrides file as shown in Enabling CV Pipeline: Set-Of-Marks (SOM) & Metadata.

NUM CV Chunks Per GPU#

Number of CV pipeline chunks that can run per GPU (default: 2).

Docker Compose#

Set NUM_CV_CHUNKS_PER_GPU=<NUM_CHUNKS> in the .env file.

Helm#

Set NUM_CV_CHUNKS_PER_GPU environment variable in the Helm overrides file as shown in Enabling CV Pipeline: Set-Of-Marks (SOM) & Metadata.

Custom CV Models#

VSS can be configured to use custom Reidentification and SAM2 models.

Docker Compose#

The custom model files must be available under the <MODEL_ROOT_DIR> directory. The corresponding path must also be specified in the custom CV Pipeline Tracker Config file.

Helm#

Refer to Customizing Models in the Tracker.

GPU / Node Assignment Configuration#

Docker Compose#

For the VSS container, set NVIDIA_VISIBLE_DEVICES=<DEVICES> in the .env file. For the other containers, pass the --gpus flag to the docker run command as shown in Local Deployment.
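For example, to expose only the first two GPUs to the VSS container (the indices are placeholders):

# Example: restrict the VSS container to GPUs 0 and 1
NVIDIA_VISIBLE_DEVICES=0,1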

Helm#

For helm deployments, GPUs can be assigned to the various services during deployment by specifying resources.limits.nvidia.com/gpu or by setting NVIDIA_VISIBLE_DEVICES environment variable. Services can be assigned to nodes by setting nodeSelector in the overrides file. Multiple examples are shown in Deploy Using Helm.
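A hedged sketch of such overrides (the nesting, node label, and GPU count below are assumptions; Deploy Using Helm shows the authoritative examples):

# Hypothetical Helm overrides sketch; key paths are assumptions
vss:
  applicationSpecs:
    vss-deployment:
      nodeSelector:
        kubernetes.io/hostname: worker-node-1   # placeholder node name
  resources:
    limits:
      nvidia.com/gpu: 2                          # placeholder GPU count for the VSS service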

Debug Configuration Options#

Enable VIA Health Evaluation#

Enable VIA health evaluation. For more details, refer to VSS Health Evaluation Reports.

Docker Compose#

Set ENABLE_VIA_HEALTH_EVAL=<true/false> in the .env file.

Helm#

Set ENABLE_VIA_HEALTH_EVAL environment variable in the Helm overrides file as shown in Configuration Options.

VIA Log Directory#

Path where VIA application logs should be written.

Docker Compose#

Set VIA_LOG_DIR=<PATH> in the .env file. <PATH> is the host path for the log directory.

Helm#

Using extraVolumes and extraVolumeMounts for the VSS chart, mount a PVC or a host path at location /tmp/via-logs in the VSS container.
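A hedged sketch of such a mount (whether these keys nest under the VSS chart values is an assumption; the host path is a placeholder):

# extraVolumes / extraVolumeMounts as named above; a PVC could be used instead of hostPath
vss:
  extraVolumes:
    - name: via-logs
      hostPath:
        path: /path/to/host/logs        # placeholder host directory for logs
  extraVolumeMounts:
    - name: via-logs
      mountPath: /tmp/via-logs          # log location expected by VSS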

Enable Dense Caption#

Enable dense caption JSON file generation (true/false).

Docker Compose#

Set ENABLE_DENSE_CAPTION=<true/false> in the .env file.

Helm#

Set ENABLE_DENSE_CAPTION environment variable in the Helm overrides file as shown in Configuration Options.

VSS container log level#

Set the log level for VSS application. Default is info. Can be set to debug, info, warning, error or critical.

Docker Compose#

Set VSS_LOG_LEVEL=<LEVEL> in the .env file.

Helm#

Set VSS_LOG_LEVEL environment variable in the Helm overrides file as shown in Configuration Options.

VSS extra arguments#

Pass extra arguments to the VSS via_server.py script. Default is empty.

Docker Compose#

Set VSS_EXTRA_ARGS=<ARGS> in the .env file.

Helm#

Set VSS_EXTRA_ARGS environment variable in the Helm overrides file as shown in Configuration Options.

Disable Live-Stream Preview#

Disable live-stream preview. This can be useful since live-stream preview requires video encoding on the server side.

Docker Compose#

Set VSS_DISABLE_LIVESTREAM_PREVIEW=1 in the .env file.

Helm#

Set VSS_DISABLE_LIVESTREAM_PREVIEW=1 environment variable in the Helm overrides file as shown in Configuration Options.

Skip Input Media Verification#

Skip input media verification. By default, VSS will verify if the user provided file / RTSP URL contains a video stream during the upload file / add live stream API call. This can be disabled by setting VSS_SKIP_INPUT_MEDIA_VERIFICATION environment variable to 1.

Docker Compose#

Set VSS_SKIP_INPUT_MEDIA_VERIFICATION=1 in the .env file.

Helm#

Set VSS_SKIP_INPUT_MEDIA_VERIFICATION=1 environment variable in the Helm overrides file as shown in Configuration Options.

Advanced Configuration Options#

NVILA Video Max Tiles#

Maximum number of video tiles for NVILA.

Docker Compose#

Set NVILA_VIDEO_MAX_TILES=<TILES> in the .env file.

Helm#

Set NVILA_VIDEO_MAX_TILES environment variable in the Helm overrides file as shown in Configuration Options.

TRT LLM Memory Usage Fraction#

Fraction of GPU memory for TRT LLM.

Docker Compose#

Set TRT_LLM_MEM_USAGE_FRACTION=<FRACTION> in the .env file.

Helm#

Set TRT_LLM_MEM_USAGE_FRACTION environment variable in the Helm overrides file as shown in Configuration Options.

RTSP Latency#

Amount of data to buffer in milliseconds. Default is 2000 ms if not specified.

Docker Compose#

Set VSS_RTSP_LATENCY=<LATENCY> in the .env file.

Helm#

Set VSS_RTSP_LATENCY environment variable in the Helm overrides file as shown in Configuration Options.

RTSP Timeout#

Timeout in milliseconds to try TCP connection for RTP data in case UDP fails. Default is 2000 ms if not specified.

Docker Compose#

Set VSS_RTSP_TIMEOUT=<TIMEOUT> in the .env file.

Helm#

Set VSS_RTSP_TIMEOUT environment variable in the Helm overrides file as shown in Configuration Options.