
Copyright © 2026, NVIDIA Corporation.


Distributed tracing


AIStore supports distributed tracing via OpenTelemetry (OTEL), complementing its existing metrics and logging features. Distributed tracing tracks client requests across AIStore’s proxy and target daemons, giving better visibility into the request flow and into where time is spent.

For more details:

  • Understanding Distributed Tracing
  • What is OpenTelemetry

WARNING: Enabling distributed tracing introduces slight overhead in AIStore’s critical data path. Enable this feature only after carefully considering its performance impact and ensuring that the benefits of enhanced observability justify the potential trade-offs.

Table of Contents

  • Getting Started
    • Example operations
  • Configuration
    • Build AIStore with tracing

Getting Started

In this section, we use the AIStore Local Playground and a local Jaeger instance. This setup is purely for demonstration purposes: it is easy to use and easy to reproduce.

Prerequisites

  • Docker
  1. Local Jaeger setup:

    docker run -d --name jaeger \
      -e COLLECTOR_OTLP_ENABLED=true \
      -p 16686:16686 \
      -p 4317:4317 \
      -p 4318:4318 \
      jaegertracing/all-in-one:latest
  2. Optionally, shut down and clean up the Local Playground:

    make kill clean
  3. Deploy the cluster with tracing enabled:

    AIS_TRACING_ENDPOINT="localhost:4317" make deploy

    This will start up an AIStore cluster with distributed-tracing enabled.
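
Before deploying, it can help to confirm that the OTLP gRPC port published by the Jaeger container actually accepts connections. The following is a minimal sketch of such a check; the helper name `endpoint_reachable` is hypothetical and not part of AIStore, and a successful TCP connection only shows that something is listening on the port, not that it speaks OTLP.

```python
import socket


def endpoint_reachable(endpoint: str, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to 'host:port' succeeds.

    Hypothetical helper for illustration; it does not validate the
    OTLP protocol, only that the port accepts connections.
    """
    host, _, port = endpoint.rpartition(":")
    try:
        with socket.create_connection((host, int(port)), timeout=timeout):
            return True
    except (OSError, ValueError):
        # OSError: refused, unreachable, or timed out.
        # ValueError: the port part is missing or not a number.
        return False


if __name__ == "__main__":
    # Same value as AIS_TRACING_ENDPOINT in the deploy step above.
    print(endpoint_reachable("localhost:4317"))
```

If the check returns False, verify that the Jaeger container is running and that port 4317 is published.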

Example operations

    ais bucket create ais://nnn
    ais put README.md ais://nnn
    ais get ais://nnn/README.md /dev/null

View traces at: http://localhost:16686
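
Besides the web UI, the Jaeger query service also answers HTTP requests on the same port. The sketch below lists the services Jaeger has seen and keeps those that start with the configured service name prefix. It rests on two assumptions: that the `/api/services` endpoint used by the Jaeger UI is available (it is an internal API without stability guarantees), and that AIStore services are reported under names beginning with `aistore`. The function names are hypothetical.

```python
import json
import urllib.request


def list_services(base_url: str = "http://localhost:16686") -> list[str]:
    """Fetch service names from Jaeger's UI-facing JSON API (assumed endpoint)."""
    with urllib.request.urlopen(f"{base_url}/api/services", timeout=5) as resp:
        payload = json.load(resp)
    # The response is expected to carry the names under the "data" key.
    return payload.get("data") or []


def with_prefix(services: list[str], prefix: str = "aistore") -> list[str]:
    """Keep only the services whose name starts with the given prefix."""
    return sorted(s for s in services if s.startswith(prefix))


if __name__ == "__main__":
    try:
        print(with_prefix(list_services()))
    except OSError as err:  # URLError is a subclass of OSError
        print(f"Jaeger is not reachable: {err}")
```

An empty list after running the example operations suggests that no traces have reached Jaeger yet.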

Configuration

The tracing configuration is cluster-wide. For the full list of AIStore config options, refer to configuration.md.

| Option name | Default value | Description |
| --- | --- | --- |
| tracing.enabled | false | If true, enables distributed tracing |
| tracing.exporter_endpoint | '' | OTEL exporter gRPC endpoint |
| tracing.service_name_prefix | aistore | Prefix added to the OTEL service name reported by the exporter |
| tracing.attributes | {} | Extra attributes to be added to the traces |
| tracing.sampler_probability | 1 (export all traces) | Fraction of traces to sample, in the range [0, 1] |
| tracing.skip_verify | false | Allow insecure (TLS) exporter gRPC connection |
| tracing.exporter_auth.token_header | '' | Request header used for the exporter auth token |
| tracing.exporter_auth.token_file | '' | File path from which to obtain the exporter auth token |
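
To make the meaning of the sampling probability concrete, here is a minimal sketch of ratio-based sampling: each trace ID is mapped to a number and compared against a threshold derived from the probability, so the same trace ID always gets the same decision. This is an illustration of the general technique only. It is not AIStore's code or the OpenTelemetry SDK's sampler, and `should_sample` is a hypothetical name.

```python
import random


def should_sample(trace_id: int, probability: float) -> bool:
    """Decide deterministically whether a trace is exported.

    Illustrative only: treats the low 63 bits of the trace ID as a
    uniformly distributed value and compares it to a threshold.
    """
    if not 0.0 <= probability <= 1.0:
        raise ValueError("probability must be within [0, 1]")
    threshold = int(probability * (1 << 63))
    return (trace_id & ((1 << 63) - 1)) < threshold


if __name__ == "__main__":
    rng = random.Random(42)
    trace_ids = [rng.getrandbits(128) for _ in range(10_000)]
    for p in (0.0, 0.1, 1.0):
        kept = sum(should_sample(t, p) for t in trace_ids)
        print(f"probability={p}: kept {kept} of {len(trace_ids)} traces")
```

With a probability of 1 every trace is kept, with 0 none is, and values in between keep roughly that fraction, which is the lever for limiting tracing overhead.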

Sample AIStore cluster configuration:

    {
      ...
      "tracing": {
        "enabled": true,
        "exporter_endpoint": "localhost:4317",
        "skip_verify": true,
        "service_name_prefix": "aistore",
        "sampler_probability": "1.0"
      },
      ...
    }
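
A tracing section like the one above can be sanity-checked before it is applied. The sketch below parses the section and reports obvious inconsistencies. The two rules it applies (a non-empty endpoint when tracing is enabled, and a probability that parses to a number in [0, 1]) are assumptions drawn from the option table, not AIStore's own validation logic, and `check_tracing` is a hypothetical helper.

```python
import json

# The "tracing" section from the sample configuration, without the elided parts.
SAMPLE = """
{
  "tracing": {
    "enabled": true,
    "exporter_endpoint": "localhost:4317",
    "skip_verify": true,
    "service_name_prefix": "aistore",
    "sampler_probability": "1.0"
  }
}
"""


def check_tracing(section: dict) -> list[str]:
    """Return a list of human-readable problems (empty if none were found)."""
    problems = []
    if section.get("enabled") and not section.get("exporter_endpoint"):
        problems.append("tracing.enabled is true but tracing.exporter_endpoint is empty")
    raw = section.get("sampler_probability", "1.0")
    try:
        prob = float(raw)
    except (TypeError, ValueError):
        problems.append(f"tracing.sampler_probability is not a number: {raw!r}")
    else:
        if not 0.0 <= prob <= 1.0:
            problems.append(f"tracing.sampler_probability is outside [0, 1]: {prob}")
    return problems


if __name__ == "__main__":
    print(check_tracing(json.loads(SAMPLE)["tracing"]))  # prints []
```

Note that the sample stores the probability as a string, so the check converts it with `float` before comparing.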

Build AIStore with tracing

Distributed tracing is a build-time option controlled by the oteltracing build tag.

When the aisnode binary is built without this build tag, the tracing configuration is ignored and all tracing functionality becomes a no-op.

    # build with tracing support
    TAGS=oteltracing make node

    # build without tracing support
    make node