VSS Observability#
This guide provides step-by-step instructions for running the VSS (Video Search and Summarization) pipeline with health evaluation enabled to collect performance metrics and logs.
Overview#
The VSS pipeline provides comprehensive performance monitoring through two complementary systems that collect detailed metrics, logs, and diagnostic information during video processing:
VIA Health Report
Collects comprehensive performance metrics and diagnostic files including:
Component-level latency measurements and timing breakdowns
GPU usage and memory consumption over time with visualizations
NVDEC usage for video decode operations
VLM processing outputs and summaries for debugging
System configuration and pipeline state information
Generated plots and CSV files for analysis
OpenTelemetry (OTEL) Tracing
Provides distributed tracing with granular latency & timing information for:
Context-aware RAG pipeline operations (document processing, graph operations, vector retrieval)
VLM (Vision-Language Model) pipeline visual processing and inference
Individual component execution times and dependencies
Request flow visualization through Jaeger UI
Cross-service timing correlation and bottleneck identification
Both systems can be used independently or together for comprehensive performance analysis.
Instructions for OTEL Tracing and Monitoring#
1. Set Up and Test Deployment of VSS Blueprint#
Refer to the VSS Pipeline Deployment Guide for instructions on using Docker Compose to deploy the VSS pipeline.
2. (Optional) Set Up Services for Jaeger UI#
To run VSS with a UI to graph OpenTelemetry traces, you can run with the Docker Compose profile perf-profiling. This is not required if you simply want to collect the logs and traces in static files.
As an example, for local deployments, use the Local Deployments Compose File before running VSS as described in the Deploy VSS section of the Deployment Guide.
To use this profile and the UI, also create a new file named otel-collector-config.yaml in the same directory as your compose.yaml file and add the following:
receivers:
  otlp:
    protocols:
      grpc:
        endpoint: 0.0.0.0:4317
      http:
        endpoint: 0.0.0.0:4318

processors:
  batch:

exporters:
  # Export to console for immediate viewing
  debug:
    verbosity: detailed
  # Export traces to Jaeger via OTLP gRPC
  otlp/jaeger:
    endpoint: jaeger:4317
    tls:
      insecure: true

service:
  pipelines:
    traces:
      receivers: [otlp]
      processors: [batch]
      exporters: [debug, otlp/jaeger]
    metrics:
      receivers: [otlp]
      processors: [batch]
      exporters: [debug]
Adjust ports as necessary.
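To confirm which extra services the perf-profiling profile will start (for example, the otel-collector and Jaeger services referenced in the configuration above), you can ask Compose to list the resolved services. This assumes you run the command from the directory containing your compose.yaml:
# List the services Compose will start when the perf-profiling profile is enabled
docker compose --profile perf-profiling config --services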
3. Enable Evaluation Environment Variables#
Users can enable VSS health evaluation reports and OTEL tracing.
Enable VSS Health evaluation reports (log files written to the container):
export ENABLE_VIA_HEALTH_EVAL=true # required for health evaluation report and performance metrics
Enable VSS OTEL tracing:
export VIA_ENABLE_OTEL=true
export VIA_CTX_RAG_ENABLE_OTEL=true
export VIA_OTEL_EXPORTER=otlp # if you want to view OTEL traces in the Jaeger UI, otherwise set 'console'
export VIA_OTEL_ENDPOINT=http://otel-collector:4318 # if exporter is set to 'otlp'
To run compose with the perf-profiling services for the Jaeger UI and otel-collector, set the following environment variable:
export COMPOSE_PROFILES=perf-profiling
See the For Helm Deployments section below to set these variables when using Helm.
4. Run VSS#
Start your VSS Docker container/server and do a warm-up run to ensure the system is properly initialized.
Upload your target video or image.
Execute the summarize() function that you want to profile, through the UI or API (see the example request sketched below).
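If you drive the run through the API instead of the UI, a request to the /summarize endpoint might look like the sketch below. The port and request fields shown here are illustrative assumptions, not the authoritative schema; check the VSS API reference for your deployment.
# Hypothetical example: host port and JSON fields are placeholders; adjust to your deployment's API schema
curl -X POST http://localhost:8100/summarize \
  -H "Content-Type: application/json" \
  -d '{"id": "<uploaded_file_id>", "prompt": "Summarize the key events in this video."}'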
5. (Optional) Viewing Traces#
Access the Jaeger UI while VSS is running to view the collected traces if otlp was the selected exporter:
# Jaeger UI is available at:
http://localhost:16686
Traces are also dumped with the other health evaluation logs regardless of whether you set up the Jaeger UI; see the Collecting VIA Health Reports and Log Files section below for how to access them.
Key Traces to Monitor#
These traces should be available to monitor:
| Trace Name | Description |
|---|---|
| VIA Pipeline End-to-End | Complete request processing time |
| VLM Pipeline Latency | Total time for vision-language model processing |
| Total Decode Latency | Aggregate video decode time across chunks |
| Decode - Chunk X | Individual chunk decode times for bottleneck identification |
| Total VLM Latency | Aggregate VLM processing time |
| VLM NIM Inference - Chunk X | Per-chunk VLM inference times |
| Context Aware RAG Latency | Total RAG processing time |
| context_manager/add_doc | RAG document addition operations |
| ASR NIM Inference - Chunk X | Per-chunk ASR inference times (if ASR is enabled) |
Collecting VIA Health Reports and Log Files#
Enable VIA Health Evaluation#
To collect the health evaluation reports and log files, ensure you set this environment variable before running the VSS pipeline.
export ENABLE_VIA_HEALTH_EVAL=true
Collect Health Report and Logs#
After running summarization queries while the container is running, collect the generated log files. The system generates files in the following location within the container:
/tmp/via-logs/files_requestid_*
Each request generates files with a unique request ID format like: 6236eeb8-be1b-4cb4-8d43-2d2d18f04613
Follow these steps to copy log files from the Docker container directly into an organized folder structure on your host system:
# Step 1: Find your container name/ID
docker ps
# Step 2: Set your container name
container_name="<your_vss_container_name>" # Typically via-engine-local-<your_username>
# Step 3: Make directory on your host system to store logs
mkdir -p ~/vss-performance-logs
# Step 4: Copy entire logs directory from container
docker cp $container_name:/tmp/via-logs ~/vss-performance-logs
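Once copied, you can narrow the directory down to a single run by filtering on its request ID (the ID below is just the example from above; substitute one from your own logs):
# List only the files belonging to one request
ls ~/vss-performance-logs/via-logs | grep "6236eeb8-be1b-4cb4-8d43-2d2d18f04613"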
The VIA Health Report system collects comprehensive performance metrics, GPU usage data, and diagnostic information when enabled, and writes to the /tmp/via-logs/ directory inside the running VSS container.
Collected Files Descriptions#
The following files are generated for each request with unique request ID <unique_request_id> when ENABLE_VIA_HEALTH_EVAL is enabled:
Configuration and Summary Files:
via_health_summary_<unique_request_id>.json - VSS configuration and component latencies
vlm_testdata_<unique_request_id>.txt - Dense caption output from VLM processing
summ_testdata_<unique_request_id>.txt - Summary output from the /summarize API call
GPU Usage Files:
via_gpu_usage_<unique_request_id>.csv - GPU usage and memory values over time
via_plot_gpu_<unique_request_id>.png - GPU usage plot visualization
via_plot_gpu_mem_<unique_request_id>.png - GPU memory usage plot visualization
Video Decode Files:
via_nvdec_usage_<unique_request_id>.csv - NVDEC usage for video decode over time
via_plot_nvdec_<unique_request_id>.png - NVDEC usage plot visualization
Additional Log Files:
via_engine.log - VSS engine logs pertaining to the Ingestion pipeline
via_engine.log.<date> - Rotated VSS engine logs with date suffix
via_ctx_rag.log - VSS CA RAG logs pertaining to the Retrieval pipeline
OTEL Traces:
via_otel_traces_<unique_request_id>.json - OTEL traces dumped in JSON format
via_otel_traces_<unique_request_id>.txt - OTEL traces in text format
Health Summary Metrics#
The via_health_summary_<unique_request_id>.json file contains comprehensive performance metrics:
System Configuration
| Metric Name | Description | Example Value |
|---|---|---|
| num_gpus | Number of GPUs available | 3 |
| gpu_names | List of GPU model names | ["NVIDIA H100 PCIe"] |
| vlm_model_name | Vision-language model used | "nvila" |
| vlm_batch_size | VLM processing batch size | 128 |
Video Processing Parameters
| Metric Name | Description | Example Value |
|---|---|---|
| input_video_duration | Total video length in seconds | 180.0 |
| chunk_size | Duration of each processing chunk in seconds | 30 |
| chunk_overlap_duration | Overlap between chunks in seconds | 0 |
| num_chunks | Total number of chunks created | 6.0 |
Timing Metrics (seconds)
| Metric Name | Description | Example Value |
|---|---|---|
| e2e_latency | Complete end-to-end processing time | 20.33 |
| vlm_pipeline_latency | Total VLM pipeline processing time | 10.46 |
| ca_rag_latency | Context-aware RAG processing time | 9.29 |
| decode_latency | Video decode time across all chunks | 8.94 |
| vlm_latency | Pure VLM inference time | 8.42 |
| req_start_time | Request start timestamp (Unix epoch) | 1751909395.34 |
Token Usage & Processing
| Metric Name | Description | Example Value |
|---|---|---|
| total_vlm_input_tokens | Total tokens sent to VLM | 12012 |
| total_vlm_output_tokens | Total tokens generated by VLM | 223 |
| pending_add_doc_latency | Time waiting for document additions | 0 |
| pending_doc_start_time | Document processing start time | 0 |
| pending_doc_end_time | Document processing end time | 0 |
Generated Files
| Metric Name | Description | Example Value |
|---|---|---|
| health_graph_paths | List of generated CSV files | ["/tmp/via-logs/…csv"] |
| health_graph_plot_paths | List of generated PNG visualization files | ["/tmp/via-logs/…png"] |
Per-Chunk Detailed Timing
The all_times array contains detailed timing for each chunk:
| Metric Name | Description | Example Value |
|---|---|---|
| chunk_id | Chunk identifier (0-based) | 0, 1, 2, … |
| decode_start/end | Chunk decode timing timestamps | 1751909395.36 / 1751909397.38 |
| vlm_start/end | VLM processing timing for chunk | 1751909397.38 / 1751909398.50 |
| add_doc_start/end | Document addition timing | 1751909398.50 / 1751909398.50 |
| vlm_stats | Token counts for specific chunk | {"input_tokens": 2002, "output_tokens": 22} |
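For quick inspection without opening the full report, the key latency metrics and per-chunk decode durations can be pulled out with jq. This is a sketch that assumes the metric names from the tables above appear as top-level keys and that all_times entries carry decode_start and decode_end fields; adjust the paths if your report is structured differently.
# Top-level latency metrics (assumed to be top-level keys in the report)
jq '{e2e_latency, vlm_pipeline_latency, ca_rag_latency, decode_latency, vlm_latency}' \
  ~/vss-performance-logs/via-logs/via_health_summary_<unique_request_id>.json
# Per-chunk decode duration derived from the all_times array (assumed field names)
jq '.all_times[] | {chunk_id, decode_seconds: (.decode_end - .decode_start)}' \
  ~/vss-performance-logs/via-logs/via_health_summary_<unique_request_id>.json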
Sample GPU Usage Plots



Troubleshooting#
Container Access Issues#
Ensure the Docker container is running:
docker ps
Check container logs for errors:
docker logs <container_name>
Verify container has sufficient disk space for log files
Missing Log Files#
Confirm ENABLE_VIA_HEALTH_EVAL=true was set before running
Check that the summarize() function completed successfully in the console logs
Verify the correct path: /tmp/via-logs/ within the container
Performance Impact#
If health evaluation significantly impacts performance, consider running separate dedicated profiling sessions
Monitor system resources during evaluation-enabled runs
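For example, you can watch container CPU/memory and GPU utilization from the host while a profiled run is in progress; these are standard Docker and NVIDIA tools, not part of VSS itself.
# Container CPU and memory usage (replace with your VSS container name)
docker stats <container_name>
# GPU utilization and memory over time
nvidia-smi dmon -s um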
For additional support or questions about VSS pipeline performance evaluation, refer to the main VSS documentation or contact the development team.
For Helm Deployments#
Ensure the following environment variables are set in the overrides file:
vss:
  applicationSpecs:
    vss-deployment:
      containers:
        vss:
          env:
            - name: ENABLE_VIA_HEALTH_EVAL
              value: "true"
            - name: VIA_ENABLE_OTEL
              value: "true"  # if you want to have OTEL traces in the logs as well
            - name: VIA_OTEL_EXPORTER
              value: "otlp"
            - name: VIA_OTEL_ENDPOINT
              value: "http://otel-collector:4318"
Note that while the otel-collector and Jaeger services won’t be deployed with Helm, you can set up your own services to use the endpoint.
The health evaluation logs will need to be copied using kubectl or helm commands.
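As a sketch, copying the report directory out of the VSS pod with kubectl might look like the following; the namespace, pod name, and container name are placeholders for your deployment.
# Find the VSS pod, then copy the health evaluation logs to the host
kubectl get pods -n <namespace>
kubectl cp <namespace>/<vss-pod-name>:/tmp/via-logs ./vss-performance-logs -c vss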