Environment Variables
AIPerf can be configured using environment variables with the AIPERF_ prefix.
All settings are organized into logical subsystems for better discoverability.
Pattern: AIPERF_{SUBSYSTEM}_{SETTING_NAME}
Examples:
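As a minimal sketch of the naming pattern, the Python snippet below composes a variable name from a subsystem and a setting name and sets it in the process environment. The helper `aiperf_env_name` is hypothetical and only illustrates the pattern. The one variable name taken from this page is AIPERF_HTTP_VIDEO_POLL_INTERVAL, and the value shown is a placeholder, not a documented default.

```python
import os


def aiperf_env_name(subsystem: str, setting: str) -> str:
    """Compose a name following AIPERF_{SUBSYSTEM}_{SETTING_NAME}.

    Hypothetical helper for illustration; it is not part of AIPerf.
    """
    return f"AIPERF_{subsystem.upper()}_{setting.upper()}"


name = aiperf_env_name("http", "video_poll_interval")

# Environment values are always strings. "1.0" is a placeholder value,
# not a documented default.
os.environ[name] = "1.0"

print(name)
```

Variables set this way are visible only to the current process and the child processes it starts, so they need to be in place before AIPerf is launched.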
Environment variable names, default values, and definitions are subject to change. These settings may be modified, renamed, or removed in future releases.
APISERVER
API server settings. Controls the host and port of the API server.
COMPRESSION
Compression settings for streaming file transfers. Controls chunk size and compression levels for zstd and gzip encodings used in dataset and results file transfers.
CONFIG
Configuration file paths for distributed deployments. Controls paths to configuration files loaded by services running in containers. These are primarily used by aiperf service when running in Kubernetes.
DATASET
Dataset loading and configuration. Controls timeouts and behavior for dataset loading operations, as well as memory-mapped dataset storage settings.
GPU
GPU telemetry collection configuration. Controls GPU metrics collection frequency, endpoint detection, and shutdown behavior. Metrics are collected from DCGM endpoints at the specified interval.
HTTP
HTTP client socket and connection configuration. Controls low-level socket options, keepalive settings, DNS caching, and connection pooling for HTTP clients. These settings optimize performance for high-throughput streaming workloads. Video Generation Polling: For async video generation APIs that use job polling (e.g., SGLang /v1/videos), the poll interval is controlled by AIPERF_HTTP_VIDEO_POLL_INTERVAL. The maximum poll time is taken from the --request-timeout-seconds CLI argument.
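The interaction between a poll interval and a maximum poll time can be sketched as a generic polling loop. This is an illustration of the idea, not AIPerf's implementation: the function `poll_until_done`, the status strings, the assumption that the interval is in seconds, and the 1.0 fallback are all hypothetical.

```python
import os
import time
from typing import Callable


def poll_until_done(
    check_status: Callable[[], str],
    poll_interval: float,
    max_poll_time: float,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> str:
    """Call check_status() every poll_interval seconds until it returns a
    terminal state, or raise TimeoutError once max_poll_time would be exceeded.

    The terminal states "completed" and "failed" are assumed for illustration.
    """
    deadline = clock() + max_poll_time
    while True:
        status = check_status()
        if status in ("completed", "failed"):
            return status
        # Stop if waiting one more interval would overrun the deadline.
        if clock() + poll_interval > deadline:
            raise TimeoutError("job did not finish within max_poll_time")
        sleep(poll_interval)


# Read the interval from the environment. The "1.0" fallback is a placeholder,
# not a documented default; seconds are an assumed unit.
poll_interval = float(os.environ.get("AIPERF_HTTP_VIDEO_POLL_INTERVAL", "1.0"))

# Simulated job that finishes on the third status check. Sleeping is stubbed
# out so the demonstration does not actually wait.
statuses = iter(["queued", "in_progress", "completed"])
result = poll_until_done(
    check_status=lambda: next(statuses),
    poll_interval=poll_interval,
    max_poll_time=60.0,
    sleep=lambda seconds: None,
)
print(result)
```

A shorter interval detects completion sooner at the cost of more status requests, while the maximum poll time bounds how long a single job can hold a request open.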
LOGGING
Logging system configuration. Controls multiprocessing log queue size and other logging behavior.
METRICS
Metrics collection and storage configuration. Controls metrics storage allocation and collection behavior.
RECORD
Record processing and export configuration. Controls batch sizes, processor scaling, and progress reporting for record processing.
SERVERMETRICS
Server metrics collection configuration. Controls server metrics collection frequency, endpoint detection, and shutdown behavior. Metrics are collected from Prometheus-compatible endpoints at the specified interval. Use the --no-server-metrics CLI flag to disable collection.
SERVICE
Service lifecycle and inter-service communication configuration. Controls timeouts for service registration, startup, shutdown, command handling, connection probing, heartbeats, and profile operations.
TIMING
Timing manager configuration. Controls timing-related settings for credit phase execution and scheduling.
UI
User interface and dashboard configuration. Controls refresh rates, update thresholds, and notification behavior for the various UI modes (dashboard, tqdm, etc.).
WORKER
Worker management and auto-scaling configuration. Controls worker pool sizing, health monitoring, load detection, and recovery behavior. The CPU_UTILIZATION_FACTOR is used in the auto-scaling formula: max_workers = max(1, min(int(cpu_count * factor) - 1, MAX_WORKERS_CAP))
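The auto-scaling formula above can be worked through with a small Python function. The formula itself is taken from this page. The function name `compute_max_workers` and the sample values for the factor (0.75) and the cap (32) are assumptions chosen for illustration, not documented defaults.

```python
def compute_max_workers(cpu_count: int, factor: float, max_workers_cap: int) -> int:
    """Apply max(1, min(int(cpu_count * factor) - 1, MAX_WORKERS_CAP))."""
    # Scale the CPU count by the factor, truncate, and reserve one core.
    scaled = int(cpu_count * factor) - 1
    # Never exceed the cap, and never drop below one worker.
    return max(1, min(scaled, max_workers_cap))


# Sample inputs; factor=0.75 and cap=32 are illustrative assumptions.
print(compute_max_workers(cpu_count=16, factor=0.75, max_workers_cap=32))  # 11
print(compute_max_workers(cpu_count=1, factor=0.75, max_workers_cap=32))   # 1
print(compute_max_workers(cpu_count=64, factor=0.75, max_workers_cap=32))  # 32
```

The three calls show the three regimes of the formula: the scaled value is used directly, the lower bound of one worker applies on very small machines, and the cap applies on large ones.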
ZMQ
ZMQ socket and communication configuration. Controls ZMQ socket timeouts, keepalive settings, retry behavior, and concurrency limits. These settings affect reliability and performance of the internal message bus.
DEV
Development and debugging configuration. Controls developer-focused features like debug logging, profiling, and internal metrics. These settings are typically disabled in production environments.