NVIDIA AgentIQ Observability#
The AgentIQ Observability Module provides configurable telemetry for logging, tracing, and metrics in AgentIQ workflows. The module:
Lets users choose telemetry options from a predefined list based on their preferences.
Listens to real-time usage statistics pushed by IntermediateStepManager.
Translates the usage statistics to OpenTelemetry format and pushes them to the configured provider or method (e.g., phoenix, OTelCollector, console, file).
These features let AgentIQ developers test their workflows locally and add observability without changing workflow code.
Configurable Components#
Users can set up telemetry configuration within the workflow configuration file.
Logging Configuration#
Users can write logs to:
Console (console)
Temporary file (file)
Both (by specifying both options)
Configuration Fields#
_type: Accepted values are console and file.
level: Log level (e.g., DEBUG, INFO, WARN, ERROR).
path (for file logging only): File path where logs will be stored.
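To make the three fields concrete, the following is a minimal sketch of how a logging section could be mapped onto Python's standard logging handlers. It is an illustration only, not AgentIQ's implementation: the function name build_handlers, the logger name, the temporary file name, and the INFO default level are all assumptions.

```python
import logging
import sys
import tempfile
from pathlib import Path


def build_handlers(logging_cfg: dict) -> list[logging.Handler]:
    """Build one stdlib handler per entry of a `logging` config mapping (hypothetical helper)."""
    handlers: list[logging.Handler] = []
    for name, entry in logging_cfg.items():
        kind = entry["_type"]
        if kind == "console":
            handler: logging.Handler = logging.StreamHandler(sys.stderr)
        elif kind == "file":
            # `path` is only meaningful for file logging.
            handler = logging.FileHandler(entry["path"])
        else:
            raise ValueError(f"unsupported logging _type {kind!r} in entry {name!r}")
        # logging.WARN is an alias of logging.WARNING, so "WARN" resolves too.
        # The "INFO" fallback is an assumption; the source states no default.
        handler.setLevel(getattr(logging, entry.get("level", "INFO")))
        handlers.append(handler)
    return handlers


# Assumed file location for this sketch only.
log_path = Path(tempfile.gettempdir()) / "aiq_logging_sketch.log"
handlers = build_handlers({
    "console": {"_type": "console", "level": "WARN"},
    "file": {"_type": "file", "path": str(log_path), "level": "DEBUG"},
})

logger = logging.getLogger("aiq_logging_sketch")
logger.setLevel(logging.DEBUG)  # let each handler apply its own threshold
for handler in handlers:
    logger.addHandler(handler)

logger.debug("written to the file only")
logger.warning("written to both console and file")
```

Giving each destination its own level is what allows a quiet console alongside a verbose file, which is the combination used when both options are specified.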
Tracing Configuration#
Users can set up tracing using:
Phoenix
Custom providers (See registration section below.)
Configuration Fields#
_type: The name of the registered provider.
endpoint: The provider's listening endpoint.
project: The associated project name.
Sample Configuration:
general:
  telemetry:
    logging:
      console:
        _type: console
        level: WARN
      file:
        _type: file
        path: /tmp/aiq_simple_calculator.log
        level: DEBUG
    tracing:
      phoenix:
        _type: phoenix
        endpoint: http://localhost:6006/v1/traces
        project: simple_calculator
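As a rough illustration of the structure of this configuration, the sketch below walks the same data as a nested Python mapping (the form a YAML loader would produce) and checks the fields described in the two Configuration Fields lists. The class names LoggingEntry and TracingEntry, the function parse_telemetry, and the INFO default are hypothetical; AgentIQ's own configuration models may differ.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class LoggingEntry:
    type: str                   # value of `_type`: "console" or "file"
    level: str = "INFO"         # assumed default; the source does not state one
    path: Optional[str] = None  # only used when type == "file"


@dataclass
class TracingEntry:
    type: str      # value of `_type`: name of a registered provider
    endpoint: str  # the provider's listening endpoint
    project: str   # the associated project name


def parse_telemetry(config: dict) -> tuple[dict, dict]:
    """Return (logging entries, tracing entries) from a parsed workflow config."""
    telemetry = config.get("general", {}).get("telemetry", {})
    logging_entries = {}
    for name, raw in telemetry.get("logging", {}).items():
        entry = LoggingEntry(type=raw["_type"], level=raw.get("level", "INFO"), path=raw.get("path"))
        if entry.type not in ("console", "file"):
            raise ValueError(f"{name}: unknown logging _type {entry.type!r}")
        if entry.type == "file" and not entry.path:
            raise ValueError(f"{name}: file logging requires a path")
        logging_entries[name] = entry
    tracing_entries = {
        name: TracingEntry(type=raw["_type"], endpoint=raw["endpoint"], project=raw["project"])
        for name, raw in telemetry.get("tracing", {}).items()
    }
    return logging_entries, tracing_entries


# The sample configuration, written as the mapping a YAML loader would produce.
sample = {
    "general": {
        "telemetry": {
            "logging": {
                "console": {"_type": "console", "level": "WARN"},
                "file": {"_type": "file", "path": "/tmp/aiq_simple_calculator.log", "level": "DEBUG"},
            },
            "tracing": {
                "phoenix": {
                    "_type": "phoenix",
                    "endpoint": "http://localhost:6006/v1/traces",
                    "project": "simple_calculator",
                },
            },
        }
    }
}

logging_entries, tracing_entries = parse_telemetry(sample)
print(logging_entries["file"].path)
print(tracing_entries["phoenix"].endpoint)
```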
AgentIQ Observability Components#
The observability component AsyncOtelSpanListener uses the Subject-Observer pattern to subscribe to the IntermediateStep event stream pushed by IntermediateStepManager. Acting as an asynchronous event listener, AsyncOtelSpanListener listens for AgentIQ intermediate step events, collects them, and translates them into OpenTelemetry spans for tracing and monitoring. AsyncOtelSpanListener is designed to:
Process events asynchronously using a dedicated event loop.
Transform function execution boundaries (FUNCTION_START, FUNCTION_END) and intermediate operations (LLM_END, TOOL_END) into OpenTelemetry spans.
Maintain function ancestry context using InvocationNode objects, so that traces span nested function calls while preserving the execution hierarchy.
The usage statistics themselves are collected by two related pieces:
aiq.profiler.decorators: Defines decorators that can wrap each workflow or LLM framework context manager to inject usage-collection callbacks.
callbacks: Directory that implements callback handlers. These handlers track usage statistics (tokens, time, inputs/outputs) and push them to the AgentIQ usage stats queue. AgentIQ profiling supports callback handlers for LangChain, LlamaIndex, CrewAI, and Semantic Kernel.
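The Subject-Observer flow described above can be sketched with the standard library alone. This is a simplified illustration, not the AsyncOtelSpanListener source: StepEvent, Span, StepStream, and SpanListener are hypothetical stand-ins, spans are plain dataclasses rather than OpenTelemetry spans, and the listener runs as a task on the current event loop rather than on a dedicated one.

```python
import asyncio
import time
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class StepEvent:
    """Hypothetical stand-in for an AgentIQ IntermediateStep."""
    event_type: str                  # FUNCTION_START, FUNCTION_END, LLM_END, TOOL_END
    name: str
    function_id: str                 # id of the function this event belongs to
    parent_id: Optional[str] = None  # id of the calling function, if any


@dataclass
class Span:
    name: str
    span_id: str
    parent_id: Optional[str]
    start: float
    end: Optional[float] = None


class StepStream:
    """Subject: forwards every pushed event to all subscribed observers."""

    def __init__(self) -> None:
        self._observers: list[Callable[[StepEvent], None]] = []

    def subscribe(self, observer: Callable[[StepEvent], None]) -> None:
        self._observers.append(observer)

    def push(self, event: StepEvent) -> None:
        for observer in self._observers:
            observer(event)


class SpanListener:
    """Observer: queues events, then turns them into spans asynchronously."""

    def __init__(self) -> None:
        self.queue: asyncio.Queue = asyncio.Queue()
        self.open_spans: dict[str, Span] = {}
        self.finished: list[Span] = []

    def on_event(self, event: StepEvent) -> None:
        self.queue.put_nowait(event)  # cheap hand-off; translation happens in run()

    async def run(self) -> None:
        while True:
            event = await self.queue.get()
            if event is None:  # sentinel: no more events
                break
            self._translate(event)

    def _translate(self, event: StepEvent) -> None:
        now = time.time()
        if event.event_type == "FUNCTION_START":
            self.open_spans[event.function_id] = Span(event.name, event.function_id, event.parent_id, now)
        elif event.event_type == "FUNCTION_END":
            span = self.open_spans.pop(event.function_id)
            span.end = now
            self.finished.append(span)
        elif event.event_type in ("LLM_END", "TOOL_END"):
            # Intermediate operations become child spans of the enclosing function.
            child_id = f"{event.function_id}/{event.name}"
            self.finished.append(Span(event.name, child_id, event.function_id, now, now))


async def main() -> list[Span]:
    stream = StepStream()
    listener = SpanListener()  # created inside the running loop
    stream.subscribe(listener.on_event)
    worker = asyncio.create_task(listener.run())

    stream.push(StepEvent("FUNCTION_START", "workflow", "f1"))
    stream.push(StepEvent("FUNCTION_START", "calculator", "f2", parent_id="f1"))
    stream.push(StepEvent("TOOL_END", "multiply", "f2"))
    stream.push(StepEvent("FUNCTION_END", "calculator", "f2", parent_id="f1"))
    stream.push(StepEvent("FUNCTION_END", "workflow", "f1"))
    listener.queue.put_nowait(None)
    await worker
    return listener.finished


spans = asyncio.run(main())
for span in spans:
    print(span.name, "parent:", span.parent_id)
```

Keeping the parent id on every span is what preserves the execution hierarchy: the nested calculator call and its tool invocation can be attached under the outer workflow span when the trace is rendered.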
Registering a New Telemetry Provider as a Plugin#
AgentIQ allows users to register custom telemetry providers using the @register_telemetry_exporter decorator in aiq.observability.register.
Example:
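import logging

logger = logging.getLogger(__name__)

# TelemetryExporterBaseConfig, Builder, and register_telemetry_exporter come
# from the aiq package; their import lines are omitted here.


class PhoenixTelemetryExporter(TelemetryExporterBaseConfig, name="phoenix"):
    endpoint: str
    project: str


@register_telemetry_exporter(config_type=PhoenixTelemetryExporter)
async def phoenix_telemetry_exporter(config: PhoenixTelemetryExporter, builder: Builder):
    # Import lazily so Phoenix is only required when this exporter is configured.
    from phoenix.otel import HTTPSpanExporter
    try:
        yield HTTPSpanExporter(endpoint=config.endpoint)
    except ConnectionError as ex:
        # Report the configured endpoint rather than a hard-coded port.
        logger.warning("Unable to connect to Phoenix at %s. Are you sure Phoenix is running?\n %s",
                       config.endpoint,
                       ex,
                       exc_info=True)
    except Exception as ex:
        logger.error("Error in Phoenix telemetry Exporter\n %s", ex, exc_info=True)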