Observability#

This page covers observability features for VSS Agents, including distributed tracing via Phoenix.

Overview#

VSS Agents support distributed tracing through Phoenix, an open-source observability platform for AI applications. The NeMo Agent Toolkit (NAT) natively integrates with Phoenix as a built-in telemetry exporter, enabling you to monitor and debug agent workflows.

Phoenix provides:

  • Project-based Organization: Organize traces by workflow or deployment

  • Distributed Tracing: Track agent execution flow, tool calls, and LLM interactions

  • Performance Monitoring: Analyze latency, token usage, and throughput metrics

  • Debugging: Inspect individual traces to understand agent behavior and diagnose issues

Enabling Telemetry#

To enable Phoenix telemetry in the agent service, configure the telemetry settings in your agent’s configuration file. Refer to Agent Configuration for the full configuration reference.

Sample Query Trace#

This example demonstrates a trace for the query:

What videos are available?

The following screenshots show the complete trace visualization with all spans expanded.

Workflow Execution#

The trace begins with the workflow span, which orchestrates the entire agent execution:

Sample Phoenix Query Trace - Workflow and LLM Call

The workflow includes:

  1. LLM Call (nvidia/nvidia-nemotron-nano-9b-v2): The first LLM call processes the user query. Click on this span to view:

    • Input: The user’s question and system prompt

    • Output: The LLM’s reasoning and decision to call a tool

Tool Call and Final Response#

After the initial LLM reasoning, the agent executes the appropriate tool:

Sample Phoenix Query Trace - Tool Call and Final Response

  1. Tool Call (vst_video_list): The agent calls the vst_video_list tool to retrieve available videos. Click on this span to view:

    • Input: Tool parameters (if any)

    • Output: List of available videos returned by the tool

  2. LLM Call (Final Response): The second LLM call takes the tool output and generates the final response for the user. Click on this span to view:

    • Input: The tool results combined with conversation context

    • Output: The formatted response presented to the user

Understanding the Trace Structure#

The trace view shows a hierarchical execution pattern typical of agent workflows:

workflow
├── LLM Call (nvidia/nvidia-nemotron-nano-9b-v2)  → Reasoning & tool selection
│   ├── Input: User query + system prompt
│   └── Output: Decision to call vst_video_list
├── Tool Call (vst_video_list)                    → Tool execution
│   ├── Input: Tool parameters
│   └── Output: List of videos
└── LLM Call (nvidia/nvidia-nemotron-nano-9b-v2)  → Final response generation
    ├── Input: Tool results + context
    └── Output: User-facing response

Inspecting Span Details#

Click on any span to view its details. The right panel displays key metrics:

  • Status: Success indicator (e.g., OK)

  • Start Time: When the span began (e.g., 1/20/2026 16:03:33 PM)

  • End Time: When the span completed (e.g., 1/20/2026 16:03:33 PM)

  • Latency: Duration of the operation (e.g., 0.26s)

  • Total Tokens: Token count for LLM calls (e.g., 2036)

The Info tab shows the span’s input and output:

  • Input: The exact parameters passed to the function or LLM prompt

  • Output: The response or return value from the operation

Key Metrics to Monitor#

When analyzing traces, pay attention to:

  • Total Latency: End-to-end time for the complete workflow

  • LLM Latency: Time spent waiting for LLM responses (often the largest contributor)

  • Tool Execution Time: Duration of individual tool calls (e.g., vst_video_list)

  • Token Usage: Monitor input and output tokens for cost optimization

Viewing Container Logs#

Container logs are essential for diagnosing deployment issues, runtime errors, and service behavior.

Listing Containers#

To see all running containers in your deployment:

# List running containers
docker ps

Viewing Logs#

View logs for a specific container:

# View logs by container name
docker logs <container-name>

# View logs by container ID
docker logs <container-id>

# Follow logs in real-time
docker logs -f <container-name>

# View the last N lines
docker logs --tail 100 <container-name>

Common Log Analysis Tasks#

Search for errors:

docker logs <container-name> | grep -i error