Frontend Configuration Reference
This page documents all configuration options for the Dynamo Frontend (python -m dynamo.frontend).
Every CLI argument has a corresponding environment variable. CLI arguments take precedence over environment variables.
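The precedence rule above can be illustrated with a minimal sketch. It is not the frontend's actual argument parser: the environment variable name `DEMO_ROUTER_MODE` and the fallback value `placeholder-default` are hypothetical, chosen only to show one common way to let a CLI argument override an environment variable, which in turn overrides a built-in default.

```python
import argparse


def build_parser(env):
    """Build a parser whose defaults are read from an environment mapping."""
    parser = argparse.ArgumentParser(prog="frontend-sketch")
    # DEMO_ROUTER_MODE and "placeholder-default" are hypothetical names,
    # used only for illustration; they are not taken from Dynamo.
    parser.add_argument(
        "--router-mode",
        default=env.get("DEMO_ROUTER_MODE", "placeholder-default"),
    )
    return parser


def resolve_router_mode(argv, env):
    """Return the effective value: CLI argument, else env var, else default."""
    return build_parser(env).parse_args(argv).router_mode
```

In real use, `env` would typically be `os.environ` and `argv` would be `sys.argv[1:]`. Passing them in explicitly keeps the sketch easy to test.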
HTTP & Networking
The Rust HTTP server also reads these environment variables (not exposed as CLI args):
Router
AIC Prefill Load Model
These options are used only when --router-mode kv is combined with --router-prefill-load-model aic.
When enabled, the frontend’s embedded KV router predicts one expected prefill duration for each admitted request, based on the selected worker’s overlap-derived cached prefix. For prompt-side load accounting, the router then decays only the oldest active prefill request on each worker.
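The accounting described above can be pictured with a toy model. This is a sketch under stated assumptions, not the router's implementation: the linear cost model in `predict_prefill_seconds`, the class name `WorkerPrefillLoad`, and the choice to carry leftover elapsed time over to the next-oldest request are all hypothetical.

```python
from collections import deque
from dataclasses import dataclass, field


def predict_prefill_seconds(prompt_tokens, cached_tokens, seconds_per_token):
    """Hypothetical linear model: only uncached prompt tokens cost prefill time."""
    uncached = max(prompt_tokens - cached_tokens, 0)
    return uncached * seconds_per_token


@dataclass
class WorkerPrefillLoad:
    """Toy prompt-side load tracker for a single worker (illustrative only)."""

    # Remaining predicted prefill time in seconds, oldest request first.
    active: deque = field(default_factory=deque)

    def admit(self, predicted_seconds):
        # One predicted duration is recorded per admitted request.
        self.active.append(predicted_seconds)

    def advance(self, elapsed_seconds):
        # Only the oldest active prefill decays. Assumption: when it reaches
        # zero it is dropped and leftover time applies to the next oldest.
        remaining = elapsed_seconds
        while self.active and remaining > 0:
            if self.active[0] > remaining:
                self.active[0] -= remaining
                remaining = 0.0
            else:
                remaining -= self.active.popleft()

    def load(self):
        # Prompt-side load is the sum of remaining predicted prefill time.
        return sum(self.active)
```

For example, after admitting requests predicted at 2.0 s and 3.0 s and advancing by 1.0 s, the tracked load is 4.0 s: the newer request keeps its full 3.0 s until the older one finishes.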
Fault Tolerance
Model Discovery
Infrastructure
KServe gRPC
See the Frontend Guide for KServe message formats and integration details.
Monitoring
Tokenizer
Experimental
HTTP Endpoints
The frontend exposes the following HTTP endpoints:
OpenAI-Compatible
Anthropic (Experimental)
Infrastructure
Endpoint Path Customization
All endpoint paths can be overridden via environment variables:
Deprecated
See Also
- Frontend Overview — quick start and feature matrix
- Frontend Guide — KServe gRPC configuration
- NVIDIA Request Extensions (nvext) — custom request fields
- Configuration and Tuning — detailed routing configuration
- Metrics — available Prometheus metrics
- Fault Tolerance — request migration and rejection