VSS Observability#

This guide provides step-by-step instructions for running the VSS (Video Summarization Service) pipeline with health evaluation enabled to collect performance metrics and logs.

Overview#

The VSS pipeline provides comprehensive performance monitoring through two complementary systems that collect detailed metrics, logs, and diagnostic information during video processing:

VIA Health Report

Collects comprehensive performance metrics and diagnostic files including:

  • Component-level latency measurements and timing breakdowns

  • GPU usage and memory consumption over time with visualizations

  • NVDEC usage for video decode operations

  • VLM processing outputs and summaries for debugging

  • System configuration and pipeline state information

  • Generated plots and CSV files for analysis

OpenTelemetry (OTEL) Tracing

Provides distributed tracing with granular latency & timing information for:

  • Context-aware RAG pipeline operations (document processing, graph operations, vector retrieval)

  • VLM (Vision-Language Model) pipeline visual processing and inference

  • Individual component execution times and dependencies

  • Request flow visualization through Jaeger UI

  • Cross-service timing correlation and bottleneck identification

Both systems can be used independently or together for comprehensive performance analysis.

Instructions for OTEL Tracing and Monitoring#

1. Set Up and Test Deployment of VSS Blueprint#

Refer to the VSS Pipeline Deployment Guide for instructions on using Docker Compose to deploy the VSS pipeline.

2. (Optional) Set Up Services for Jaeger UI#

To run VSS with a UI to graph OpenTelemetry traces, you can run with the Docker Compose profile perf-profiling. This is not required if you simply want to collect the logs and traces in static files.

As an example, for local deployments, use the Local Deployments Compose File before running VSS as described in the Deploy VSS section of the Deployment Guide.

To use this profile and the UI, also create a file named otel-collector-config.yaml in the same directory as your compose.yaml file and add the following:

receivers:
  otlp:
    protocols:
      grpc:
        endpoint: 0.0.0.0:4317
      http:
        endpoint: 0.0.0.0:4318

processors:
  batch:

exporters:
  # Export to console for immediate viewing
  debug:
    verbosity: detailed

  # Export traces to Jaeger via OTLP gRPC
  otlp/jaeger:
    endpoint: jaeger:4317
    tls:
      insecure: true

service:
  pipelines:
    traces:
      receivers: [otlp]
      processors: [batch]
      exporters: [debug, otlp/jaeger]

    metrics:
      receivers: [otlp]
      processors: [batch]
      exporters: [debug]
Adjust the ports as necessary; the HTTP receiver port (4318) must match the port in the VIA_OTEL_ENDPOINT value set in the next step.

3. Enable Evaluation Environment Variables#

Users can enable VSS health evaluation reports and OTEL tracing.

Enable VSS Health evaluation reports (log files written to the container):

export ENABLE_VIA_HEALTH_EVAL=true # required for health evaluation report and performance metrics

Enable VSS OTEL tracing:

export VIA_ENABLE_OTEL=true
export VIA_CTX_RAG_ENABLE_OTEL=true
export VIA_OTEL_EXPORTER=otlp # if you want to view OTEL traces in the Jaeger UI, otherwise set 'console'
export VIA_OTEL_ENDPOINT=http://otel-collector:4318 # if exporter is set to 'otlp'

To run compose with the perf-profiling services for the Jaeger UI and otel-collector, set the following environment variable:

export COMPOSE_PROFILES=perf-profiling
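
After setting COMPOSE_PROFILES, bring the stack up as described in the Deployment Guide and confirm that the profiling services started. A minimal sketch is shown below; the otel-collector service name is an assumption based on the VIA_OTEL_ENDPOINT value used above, so check your compose file for the exact names.

# Start (or restart) the deployment with the perf-profiling profile active
docker compose up -d

# Confirm the extra services are running and check the collector output
docker compose ps
docker compose logs otel-collector | tail -n 20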

See the For Helm Deployments section below for how to set these variables when using Helm.

4. Run VSS#

  1. Start your VSS Docker container/server, and do a warm-up run to ensure the system is properly initialized

  2. Upload your target video or image

  3. Execute the summarize() function that you want to profile (through the UI or API)
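
If you prefer to drive the profiling run from the command line instead of the UI, the sketch below shows the general shape of such a script. The port (8100), endpoint paths (/files, /summarize), and request fields are assumptions for illustration only; confirm them against the VSS API reference for your deployment.

# Hypothetical example -- verify ports, paths, and fields against the VSS API reference
BACKEND=http://localhost:8100

# Upload the target video and capture the returned file ID (assumed response field "id")
FILE_ID=$(curl -s -X POST "$BACKEND/files" \
  -F "purpose=vision" -F "media_type=video" \
  -F "file=@/path/to/video.mp4" | jq -r '.id')

# Trigger the summarization request that you want to profile
curl -s -X POST "$BACKEND/summarize" \
  -H "Content-Type: application/json" \
  -d "{\"id\": \"$FILE_ID\", \"model\": \"<your_vlm_model>\", \"chunk_duration\": 30}"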

5. (Optional) Viewing Traces#

Access the Jaeger UI while VSS is running to view the collected traces if otlp was the selected exporter:

# Jaeger UI is available at:
http://localhost:16686

Traces are also dumped with the other health evaluation logs regardless of whether you set up the Jaeger UI; see the Collecting VIA Health Reports and Log Files section below for how to access them.
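
Besides the browser UI, Jaeger exposes an HTTP query API on the same port, which can be handy for scripted checks. A small sketch (the service name reported by VSS may vary, so list the registered services first):

# List the service names that have reported traces to Jaeger
curl -s http://localhost:16686/api/services | jq

# Fetch recent traces for a given service and print the span names
curl -s "http://localhost:16686/api/traces?service=<service_name>&limit=5" \
  | jq -r '.data[].spans[].operationName' | sort | uniq -c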

Key Traces to Monitor#

These traces should be available to monitor:

Trace Name                     Description
VIA Pipeline End-to-End        Complete request processing time
VLM Pipeline Latency           Total time for vision-language model processing
Total Decode Latency           Aggregate video decode time across chunks
Decode - Chunk X               Individual chunk decode times for bottleneck identification
Total VLM Latency              Aggregate VLM processing time
VLM NIM Inference - Chunk X    Per-chunk VLM inference times
Context Aware RAG Latency      Total RAG processing time
context_manager/add_doc        RAG document addition operations
ASR NIM Inference - Chunk X    Per-chunk ASR inference times (if ASR is enabled)

Collecting VIA Health Reports and Log Files#

Enable VIA Health Evaluation#

To collect the health evaluation reports and log files, ensure you set this environment variable before running the VSS pipeline.

export ENABLE_VIA_HEALTH_EVAL=true

Collect Health Report and Logs#

After running summarization queries while the container is running, collect the generated log files. The system generates files in the following location within the container:

/tmp/via-logs/files_requestid_*

Each request generates files with a unique request ID format like: 6236eeb8-be1b-4cb4-8d43-2d2d18f04613

Follow these steps to copy log files from the Docker container directly into an organized folder structure on your host system:

# Step 1: Find your container name/ID
docker ps

# Step 2: Set your container name
container_name="<your_vss_container_name>"   # Typically via-engine-local-<your_username>

# Step 3: Make directory on your host system to store logs
mkdir -p ~/vss-performance-logs

# Step 4: Copy entire logs directory from container
docker cp $container_name:/tmp/via-logs ~/vss-performance-logs
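
If you want to inspect or copy only the files belonging to a single request, you can filter by the request ID embedded in the file names. A small sketch:

# List the files generated for one specific request inside the container
request_id="<your_request_id>"
docker exec "$container_name" sh -c "ls -l /tmp/via-logs/*${request_id}*"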

The VIA Health Report system collects comprehensive performance metrics, GPU usage data, and diagnostic information when enabled, and writes to the /tmp/via-logs/ directory inside the running VSS container.

Collected Files Descriptions#

The following files are generated for each request with unique request ID <unique_request_id> when ENABLE_VIA_HEALTH_EVAL is enabled:

Configuration and Summary Files:

  • via_health_summary_<unique_request_id>.json - VSS configuration and component latencies

  • vlm_testdata_<unique_request_id>.txt - Dense caption output from VLM processing

  • summ_testdata_<unique_request_id>.txt - Summary output from /summarize API call

GPU Usage Files:

  • via_gpu_usage_<unique_request_id>.csv - GPU usage and memory values over time

  • via_plot_gpu_<unique_request_id>.png - GPU usage plot visualization

  • via_plot_gpu_mem_<unique_request_id>.png - GPU memory usage plot visualization

Video Decode Files:

  • via_nvdec_usage_<unique_request_id>.csv - NVDEC usage for video decode over time

  • via_plot_nvdec_<unique_request_id>.png - NVDEC usage plot visualization

Additional Log Files:

  • via_engine.log - VSS engine logs pertaining to the ingestion pipeline

  • via_engine.log.<date> - Rotated VSS engine logs with date suffix

  • via_ctx_rag.log - VSS CA RAG logs pertaining to the retrieval pipeline

OTEL Traces:

  • via_otel_traces_<unique_request_id>.json - OTEL traces dumped in JSON format

  • via_otel_traces_<unique_request_id>.txt - OTEL traces in text format
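
The exact JSON layout of the dumped trace file is not documented here, but a quick way to get a per-span-name count is a recursive jq filter that collects every object carrying a "name" field (an assumption; inspect the file and adjust the filter if needed):

# Count spans by name in a dumped OTEL trace file (path assumes the docker cp step above)
jq -r '.. | objects | select(has("name")) | .name' \
  ~/vss-performance-logs/via-logs/via_otel_traces_<unique_request_id>.json \
  | sort | uniq -c | sort -rn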

Health Summary Metrics#

The via_health_summary_<unique_request_id>.json file contains comprehensive performance metrics:

System Configuration

Metric Name       Description                   Example Value
num_gpus          Number of GPUs available      3
gpu_names         List of GPU model names       ["NVIDIA H100 PCIe"]
vlm_model_name    Vision-language model used    "nvila"
vlm_batch_size    VLM processing batch size     128

Video Processing Parameters

Metric Name               Description                          Example Value
input_video_duration      Total video length in seconds        180.0
chunk_size                Duration of each processing chunk    30
chunk_overlap_duration    Overlap between chunks in seconds    0
num_chunks                Total number of chunks created       6.0

Timing Metrics (seconds)

Metric Name             Description                             Example Value
e2e_latency             Complete end-to-end processing time     20.33
vlm_pipeline_latency    Total VLM pipeline processing time      10.46
ca_rag_latency          Context-aware RAG processing time       9.29
decode_latency          Video decode time across all chunks     8.94
vlm_latency             Pure VLM inference time                 8.42
req_start_time          Request start timestamp (Unix epoch)    1751909395.34

Token Usage & Processing

Metric Name                Description                            Example Value
total_vlm_input_tokens     Total tokens sent to VLM               12012
total_vlm_output_tokens    Total tokens generated by VLM          223
pending_add_doc_latency    Time waiting for document additions    0
pending_doc_start_time     Document processing start time         0
pending_doc_end_time       Document processing end time           0

Generated Files

Metric Name                Description                                  Example Value
health_graph_paths         List of generated CSV files                  ["/tmp/via-logs/…csv"]
health_graph_plot_paths    List of generated PNG visualization files    ["/tmp/via-logs/…png"]

Per-Chunk Detailed Timing

The all_times array contains detailed timing for each chunk:

Metric Name          Description                        Example Value
chunk_id             Chunk identifier (0-based)         0, 1, 2, …
decode_start/end     Chunk decode timing timestamps     1751909395.36/1751909397.38
vlm_start/end        VLM processing timing for chunk    1751909397.38/1751909398.50
add_doc_start/end    Document addition timing           1751909398.50/1751909398.50
vlm_stats            Token counts for specific chunk    {"input_tokens": 2002, "output_tokens": 22}
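
To pull just the headline timings out of the health summary without opening the file, a small jq sketch is shown below. The field names match the tables above; if your VSS version nests them differently, the filter will return null values and you should inspect the file directly.

# Extract the top-level timing metrics from the health summary (path assumes the docker cp step above)
jq '{e2e_latency, vlm_pipeline_latency, ca_rag_latency, decode_latency, vlm_latency, num_chunks}' \
  ~/vss-performance-logs/via-logs/via_health_summary_<unique_request_id>.json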

Sample GPU Usage Plots

(Sample plots: GPU usage, GPU memory usage, and NVDEC usage, as generated in the via_plot_*.png files described in Collected Files Descriptions above.)

Troubleshooting#

Container Access Issues#

  • Ensure the Docker container is running: docker ps

  • Check container logs for errors: docker logs <container_name>

  • Verify container has sufficient disk space for log files
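
A quick disk space check for the log directory, run against your container:

# Verify free space on the filesystem backing /tmp/via-logs
docker exec <container_name> df -h /tmp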

Missing Log Files#

  • Confirm ENABLE_VIA_HEALTH_EVAL=true was set before running

  • Check that the summarize() function completed successfully in the console logs

  • Verify the correct path: /tmp/via-logs/ within the container
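
To confirm the flag actually made it into the running container (a common cause of missing files), you can check the container environment directly:

# Prints "true" if the variable is set inside the container; empty output means it is not
docker exec <container_name> printenv ENABLE_VIA_HEALTH_EVAL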

Performance Impact#

  • If health evaluation significantly impacts performance, consider running separate dedicated profiling sessions

  • Monitor system resources during evaluation-enabled runs

For additional support or questions about VSS pipeline performance evaluation, refer to the main VSS documentation or contact the development team.

For Helm Deployments#

Ensure the following environment variables are set in the overrides file:

vss:
  applicationSpecs:
    vss-deployment:
      containers:
        vss:
          env:
            - name: ENABLE_VIA_HEALTH_EVAL
              value: "true"

            - name: VIA_ENABLE_OTEL
              value: "true" # if you want to have OTEL traces in the logs as well

            - name: VIA_OTEL_EXPORTER
              value: "otlp"

            - name: VIA_OTEL_ENDPOINT
              value: "http://otel-collector:4318"

Note that while the otel-collector and Jaeger services won’t be deployed with Helm, you can set up your own services to use the endpoint.

The health evaluation logs will need to be copied out of the VSS pod, for example using kubectl.
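
A minimal sketch for copying the logs out of a Kubernetes deployment; the pod name is an assumption, so look it up first with kubectl get pods:

# Find the VSS pod and copy the health evaluation logs to the local machine
kubectl get pods
POD=<your-vss-pod-name>
kubectl cp "$POD:/tmp/via-logs" ./vss-performance-logs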