Distributed tracing
AIStore supports distributed tracing via OpenTelemetry (OTEL), complementing its existing metrics and logging features. Distributed tracing tracks client requests across AIStore’s proxy and target daemons, giving better visibility into the request flow and into where time is spent.
For more details:
WARNING: Enabling distributed tracing introduces slight overhead in AIStore’s critical data path. Enable this feature only after carefully considering its performance impact and ensuring that the benefits of enhanced observability justify the potential trade-offs.
Getting Started
In this section, we use AIStore Local Playground and a local Jaeger instance. This setup is purely for demonstration purposes: it is easy to use and to reproduce.
Prerequisites
- Docker
- Local Jaeger setup
- Optionally, shut down and clean up Local Playground:
- Deploy the cluster with tracing enabled:
This starts an AIStore cluster with distributed tracing enabled.
Example operations
View traces at: http://localhost:16686
Configuration
Tracing is configured cluster-wide. For the list of AIStore configuration options, refer to configuration.md.
Sample AIStore cluster configuration:
Build AIStore with tracing
Distributed tracing is a build-time option controlled by the oteltracing build tag.
When the aisnode binary is built without this build tag, the tracing configuration is ignored and the entire tracing functionality becomes a no-op.
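The no-op behavior described above is a common Go pattern: two files expose the same functions, and a build constraint selects which one is compiled. The sketch below shows only what the "tag absent" variant might look like. It is an assumption-laden illustration, not AIStore's source: the file name, `Config` fields, `Init`, and `StartSpan` are all hypothetical, and the endpoint value is a placeholder. With the standard Go toolchain, tags are passed as `go build -tags <tag>`; AIStore's own build scripts may wrap this differently.

```go
// Hypothetical file: tracing_off.go
// In a real repository this file would carry a build constraint that
// excludes it when the oteltracing tag is set, and a sibling file guarded
// by the opposite constraint would provide the real implementation.
package main

import "fmt"

// Config stands in for a tracing configuration section; field names are illustrative.
type Config struct {
	Enabled  bool
	Endpoint string
}

// Init is the no-op variant: it accepts the configuration and ignores it.
func Init(cfg Config) (active bool) { return false }

// StartSpan returns the function that would end the span; here both do nothing.
func StartSpan(name string) (end func()) { return func() {} }

func main() {
	// Even with tracing "enabled" in the config, this build does nothing.
	active := Init(Config{Enabled: true, Endpoint: "localhost:4317"})
	end := StartSpan("get-object")
	defer end()
	fmt.Println("tracing active:", active) // prints "tracing active: false"
}
```

Callers use the same `Init` and `StartSpan` calls in both builds, so the rest of the code base needs no conditional logic and the untagged binary pays no tracing cost beyond empty function calls.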