Dynamo Request Planes User Guide#
Overview#
Dynamo supports multiple transport mechanisms for its request plane (the communication layer between services). You can choose from three different request plane modes based on your deployment requirements:
TCP (default): Direct TCP connection for optimal performance
NATS: Message broker-based request plane
HTTP: HTTP/2-based request plane
This guide explains how to configure and use the request plane in your Dynamo deployment.
What is a Request Plane?#
The request plane is the transport layer that handles communication between Dynamo services (e.g., frontend to backend, worker to worker). Different request planes offer different trade-offs:
| Request Plane | Suitable For | Characteristics |
|---|---|---|
| NATS | Production deployments with KV routing | Requires NATS infrastructure, provides pub/sub patterns, highest flexibility |
| TCP | Low-latency direct communication | Direct connections, minimal overhead |
| HTTP | Standard deployments, debugging | HTTP/2 protocol, easier observability with standard tools, widely compatible |
Request Plane vs KV Event Plane#
Dynamo has two independent communication planes:
Request plane (DYN_REQUEST_PLANE): how RPC requests flow between components (frontend → router → worker), via tcp, http, or nats.
KV event plane (currently only NATS is supported): how KV cache events (and optional router replica sync) are distributed/persisted for KV-aware routing.
Note: If you are using the tcp or http request plane with KV events enabled (the default), NATS is initialized automatically. You can optionally set the NATS_SERVER environment variable (e.g., NATS_SERVER=nats://nats-hostname:port) to point to a custom NATS server; otherwise, it defaults to localhost:4222. To disable NATS completely, pass --no-kv-events to the frontend.
Because they are independent, you can mix them.
For example, a deployment with TCP request plane can use different KV event planes:
JetStream KV events: requests use TCP, KV routing still uses NATS JetStream + object store for persistence.
NATS Core KV events (local indexer): requests use TCP, KV events use NATS Core pub/sub and persistence lives on workers.
No KV events: requests use TCP and KV routing runs without events (no NATS required, but no event-backed persistence).
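As a rough illustration of mixing the two planes, the sketch below sets up two of the combinations listed above with a TCP request plane. The NATS hostname is a placeholder, and the launch commands are commented out so that the snippet only sets variables. It assumes the frontend accepts --no-kv-events exactly as described in the note above.

```shell
# Combination 1: TCP request plane, KV events over a custom NATS server.
export DYN_REQUEST_PLANE=tcp
export NATS_SERVER=nats://nats-hostname:4222   # placeholder host; 4222 is the documented default port
# python -m dynamo.frontend --http-port=8000 &

# Combination 2: TCP request plane, no KV events, so NATS is not needed.
unset NATS_SERVER
# python -m dynamo.frontend --http-port=8000 --no-kv-events &
```

In both cases the request plane setting stays the same; only the KV event side changes.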
Configuration#
Environment Variable#
Set the request plane mode using the DYN_REQUEST_PLANE environment variable:
export DYN_REQUEST_PLANE=<mode>
Where <mode> is one of:
tcp (default)
nats
http
The value is case-insensitive.
Default Behavior#
If DYN_REQUEST_PLANE is not set or contains an invalid value, Dynamo defaults to tcp.
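The selection rule described in this section (case-insensitive value, fallback to tcp) can be sketched as a small shell function. This is only an illustration of the documented behavior, not Dynamo's own implementation, and resolve_request_plane is a hypothetical name.

```shell
# Hypothetical helper: mirrors the documented rules, not Dynamo's code.
resolve_request_plane() {
  # Lowercase the input, since the setting is case-insensitive.
  mode=$(printf '%s' "${1:-}" | tr '[:upper:]' '[:lower:]')
  case "$mode" in
    tcp|nats|http) printf '%s\n' "$mode" ;;
    *)             printf 'tcp\n' ;;   # unset or unrecognized: use the default
  esac
}

# Show the mode that the current environment would select.
resolve_request_plane "${DYN_REQUEST_PLANE:-}"
```

Running this in a launch script before starting services is one way to log which mode a given environment resolves to.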
Usage Examples#
Using TCP (Default)#
TCP is the default request plane and provides direct, low-latency communication between services.
Configuration:
# TCP is the default, so no need to set DYN_REQUEST_PLANE explicitly
# But you can explicitly set it if desired:
export DYN_REQUEST_PLANE=tcp
# Optional: Configure TCP server host and port
export DYN_TCP_RPC_HOST=0.0.0.0 # Default host
# export DYN_TCP_RPC_PORT=9999 # Optional: specify a fixed port
# Run your Dynamo service
DYN_REQUEST_PLANE=tcp python -m dynamo.frontend --http-port=8000 &
DYN_REQUEST_PLANE=tcp python -m dynamo.vllm --model Qwen/Qwen3-0.6B
Note: By default, TCP uses an OS-assigned free port (port 0). This is ideal for environments where multiple services may run on the same machine or when you want to avoid port conflicts. If you need a specific port (e.g., for firewall rules), set DYN_TCP_RPC_PORT explicitly.
When to use TCP:
Simple deployments with direct service-to-service communication (e.g. frontend to backend)
Minimal infrastructure requirements (NATS is initialized by default for KV events but can be disabled with --no-kv-events)
Low-latency requirements
TCP Configuration Options:
Additional TCP-specific environment variables:
DYN_TCP_RPC_HOST: Server host address (default: auto-detected)
DYN_TCP_RPC_PORT: Server port. If not set, the OS assigns a free port automatically (recommended for most deployments). Set explicitly only if you need a specific port for firewall rules.
DYN_TCP_MAX_MESSAGE_SIZE: Maximum message size for TCP client (default: 32MB)
DYN_TCP_REQUEST_TIMEOUT: Request timeout for TCP client (default: 10 seconds)
DYN_TCP_POOL_SIZE: Connection pool size for TCP client (default: 50)
DYN_TCP_CONNECT_TIMEOUT: Connect timeout for TCP client (default: 3 seconds)
DYN_TCP_CHANNEL_BUFFER: Request channel buffer size for TCP client (default: 100)
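When debugging a TCP deployment, it can help to print which of these variables are set in the current shell. The sketch below does that and falls back to the default text from the list above. describe_setting is a hypothetical helper, and the snippet deliberately avoids assuming any value format (bytes, seconds, and so on) for the variables.

```shell
# Hypothetical helper (not part of Dynamo): report a variable's value,
# or the default documented in this guide when it is unset or empty.
describe_setting() {
  name=$1
  documented_default=$2
  eval "value=\${$name:-}"   # indirect lookup of the variable named by $name
  if [ -n "$value" ]; then
    printf '%s=%s (set explicitly)\n' "$name" "$value"
  else
    printf '%s is unset (documented default: %s)\n' "$name" "$documented_default"
  fi
}

describe_setting DYN_TCP_RPC_HOST         "auto-detected"
describe_setting DYN_TCP_RPC_PORT         "OS-assigned free port"
describe_setting DYN_TCP_MAX_MESSAGE_SIZE "32MB"
describe_setting DYN_TCP_REQUEST_TIMEOUT  "10 seconds"
describe_setting DYN_TCP_POOL_SIZE        "50"
describe_setting DYN_TCP_CONNECT_TIMEOUT  "3 seconds"
describe_setting DYN_TCP_CHANNEL_BUFFER   "100"
```

The same helper works for the HTTP variables by passing their names and documented defaults instead.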
Using HTTP#
HTTP/2 provides a standards-based request plane that’s easy to debug and widely compatible.
Configuration:
# Optional: Configure HTTP server host and port
export DYN_HTTP_RPC_HOST=0.0.0.0 # Default host
export DYN_HTTP_RPC_PORT=8888 # Default port
export DYN_HTTP_RPC_ROOT_PATH=/v1/rpc # Default path
# Run your Dynamo service
DYN_REQUEST_PLANE=http python -m dynamo.frontend --http-port=8000 &
DYN_REQUEST_PLANE=http python -m dynamo.vllm --model Qwen/Qwen3-0.6B
When to use HTTP:
Standard deployments requiring HTTP compatibility
Debugging scenarios (use curl, browser tools, etc.)
Integration with HTTP-based infrastructure
Load balancers and proxies that work with HTTP
HTTP Configuration Options:
Additional HTTP-specific environment variables:
DYN_HTTP_RPC_HOST: Server host address (default: auto-detected)
DYN_HTTP_RPC_PORT: Server port (default: 8888)
DYN_HTTP_RPC_ROOT_PATH: Root path for RPC endpoints (default: /v1/rpc)
DYN_HTTP2_*: Various HTTP/2 client configuration options:
DYN_HTTP2_MAX_FRAME_SIZE: Maximum frame size for HTTP client (default: 1MB)
DYN_HTTP2_MAX_CONCURRENT_STREAMS: Maximum concurrent streams for HTTP client (default: 1000)
DYN_HTTP2_POOL_MAX_IDLE_PER_HOST: Maximum idle connections per host for HTTP client (default: 100)
DYN_HTTP2_POOL_IDLE_TIMEOUT_SECS: Idle timeout for HTTP client (default: 90 seconds)
DYN_HTTP2_KEEP_ALIVE_INTERVAL_SECS: Keep-alive interval for HTTP client (default: 30 seconds)
DYN_HTTP2_KEEP_ALIVE_TIMEOUT_SECS: Keep-alive timeout for HTTP client (default: 10 seconds)
DYN_HTTP2_ADAPTIVE_WINDOW: Enable adaptive flow control (default: true)
Using NATS#
NATS provides durable JetStream messaging for the request plane and can also be used for KV events (and router replica sync).
Prerequisites:
NATS server must be running and accessible
Configure NATS connection via standard Dynamo NATS environment variables
# Explicitly set to NATS
export DYN_REQUEST_PLANE=nats
# Run your Dynamo service
DYN_REQUEST_PLANE=nats python -m dynamo.frontend --http-port=8000 &
DYN_REQUEST_PLANE=nats python -m dynamo.vllm --model Qwen/Qwen3-0.6B
When to use NATS:
Production deployments with service discovery
KV-aware routing with accurate cache state tracking (requires NATS for event transport). Note: approximate mode (--no-kv-events) provides KV routing without NATS but with reduced accuracy.
Need for message replay and persistence features
Limitations:
NATS does not support payloads beyond 16MB (use TCP for larger payloads)
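To make the payload limit concrete, the sketch below checks a request body stored in a file against the 16MB figure. The helper name is hypothetical, and reading "16MB" as 16 * 1024 * 1024 bytes is an assumption; the exact byte threshold in a given deployment may differ.

```shell
# Assumption: "16MB" is interpreted as 16 * 1024 * 1024 bytes.
NATS_PAYLOAD_LIMIT_BYTES=$((16 * 1024 * 1024))

# Hypothetical helper: succeed if the file at $1 is within the limit.
payload_fits_nats() {
  size=$(wc -c < "$1" | tr -d ' ')   # strip padding some wc versions add
  [ "$size" -le "$NATS_PAYLOAD_LIMIT_BYTES" ]
}

# Hypothetical helper: print a hint for a given payload file.
check_payload() {
  if payload_fits_nats "$1"; then
    echo "within the NATS payload limit"
  else
    echo "exceeds the NATS payload limit; use the TCP request plane"
  fi
}
```

For example, check_payload request.json prints a hint for a saved request body. Note that the TCP client has its own limit, DYN_TCP_MAX_MESSAGE_SIZE (default: 32MB), which may also need raising for very large payloads.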
Complete Example#
For a complete working example that launches a Dynamo deployment with the TCP, HTTP, or NATS request plane, see examples/backends/vllm/launch/agg_request_planes.sh.
Real-World Example#
The Dynamo repository includes a complete example demonstrating all three request planes:
Location: examples/backends/vllm/launch/agg_request_planes.sh
cd examples/backends/vllm/launch
# Run with TCP
./agg_request_planes.sh --tcp
# Run with HTTP
./agg_request_planes.sh --http
# Run with NATS
./agg_request_planes.sh --nats
Architecture Details#
Network Manager#
The request plane implementation is centralized in the Network Manager (lib/runtime/src/pipeline/network/manager.rs), which:
Reads the DYN_REQUEST_PLANE environment variable at startup
Creates the appropriate server and client implementations
Provides a transport-agnostic interface to the rest of the codebase
Manages all network configuration and lifecycle
Transport Abstraction#
All request plane implementations conform to common trait interfaces:
RequestPlaneServer: Server-side interface for receiving requests
RequestPlaneClient: Client-side interface for sending requests
This abstraction means your application code doesn’t need to change when switching request planes.
Configuration Loading#
Request plane configuration is loaded from environment variables at startup and cached globally. The configuration hierarchy is:
Mode Selection: DYN_REQUEST_PLANE (defaults to tcp)
Transport-Specific Config: Mode-specific environment variables (e.g., DYN_TCP_*, DYN_HTTP2_*)
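The two-level hierarchy above can be inspected from a shell with a short sketch like the following: first resolve the mode, then list only the variables that belong to it. show_plane_config is a hypothetical helper, and the NATS_ prefix used for the nats mode is an assumption based on the NATS_SERVER variable mentioned in this guide.

```shell
# Hypothetical helper: print the selected mode, then the environment
# variables whose prefix matches that mode.
show_plane_config() {
  mode=$(printf '%s' "${DYN_REQUEST_PLANE:-tcp}" | tr '[:upper:]' '[:lower:]')
  case "$mode" in
    http) pattern='^DYN_HTTP2?_' ;;            # DYN_HTTP_RPC_* and DYN_HTTP2_*
    nats) pattern='^NATS_' ;;                  # assumed prefix for NATS settings
    *)    mode=tcp; pattern='^DYN_TCP_' ;;     # default mode
  esac
  echo "mode: $mode"
  env | grep -E "$pattern" | sort
}

show_plane_config
```

Only exported variables appear in the output, which matches what a launched service would actually see.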
Migration Guide#
From NATS to TCP#
Stop your Dynamo services
Set environment variable DYN_REQUEST_PLANE=tcp
Optionally configure TCP-specific settings (e.g., DYN_TCP_RPC_HOST). Note: DYN_TCP_RPC_PORT is optional; if not set, an OS-assigned free port is used automatically.
Restart your services
From NATS to HTTP#
Stop your Dynamo services
Set environment variable DYN_REQUEST_PLANE=http
Optionally configure HTTP-specific settings (DYN_HTTP_RPC_PORT, etc.)
Restart your services
Testing the Migration#
After switching request planes, verify your deployment:
# Test with a simple request
curl http://localhost:8000/v1/chat/completions \
-H "Content-Type: application/json" \
-d '{
"model": "Qwen/Qwen3-0.6B",
"messages": [{"role": "user", "content": "Hello!"}]
}'
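For repeated checks, the request above can be wrapped in a small function that reports only whether the frontend answered with a 2xx status. This is a minimal sketch: smoke_test and is_success_status are hypothetical names, and the URL and model are simply the ones used in the example above.

```shell
# Hypothetical helper: succeed only for 2xx HTTP status codes.
is_success_status() {
  case "$1" in
    2[0-9][0-9]) return 0 ;;
    *)           return 1 ;;
  esac
}

# Hypothetical helper: send one chat completion request and report the result.
smoke_test() {
  url=${1:-http://localhost:8000/v1/chat/completions}
  # -w '%{http_code}' prints the status code; curl reports 000 if it cannot connect.
  status=$(curl -s -o /dev/null -w '%{http_code}' "$url" \
    -H "Content-Type: application/json" \
    -d '{"model": "Qwen/Qwen3-0.6B", "messages": [{"role": "user", "content": "Hello!"}]}') || true
  if is_success_status "$status"; then
    echo "OK (HTTP $status)"
  else
    echo "FAILED (HTTP $status)"
    return 1
  fi
}
```

Calling smoke_test after each restart gives a quick pass/fail signal without reading the full response body.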
Troubleshooting#
Issue: Services Can’t Communicate#
Symptoms: Requests time out or fail to reach the backend
Solutions:
Verify all services use the same DYN_REQUEST_PLANE setting
Check that server ports are not blocked by k8s network policies or firewalls
For TCP/HTTP: Ensure host/port configurations are correct and accessible
For NATS: Verify NATS server is running and accessible
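One way to check the first point on Linux is to read the environment each running service was started with. The sketch below is an assumption-laden illustration: both helper names are hypothetical, it relies on /proc/<pid>/environ (Linux only, and readable only for processes you own or as root), and it reflects the environment at process start rather than any later changes.

```shell
# Hypothetical helper: read a NUL-separated environment block on stdin
# (the /proc/<pid>/environ format) and print the request plane it selects.
plane_from_environ() {
  mode=$(tr '\0' '\n' | sed -n 's/^DYN_REQUEST_PLANE=//p' | tr '[:upper:]' '[:lower:]')
  printf '%s\n' "${mode:-tcp}"   # unset means the documented default, tcp
}

# Linux only: print the request plane each given PID was started with.
report_planes() {
  for pid in "$@"; do
    printf '%s: %s\n' "$pid" "$(plane_from_environ < "/proc/$pid/environ")"
  done
}
```

For example, report_planes followed by the frontend and worker PIDs prints one line per process; any mismatch between the lines points to a misconfigured service.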
Issue: “Invalid request plane mode” Error#
Symptoms: Service fails to start with configuration error
Solutions:
Check DYN_REQUEST_PLANE spelling (valid values: nats, tcp, http)
Value is case-insensitive but must be one of the three options
If not set, defaults to tcp
Issue: Port Conflicts#
Symptoms: Server fails to start due to “address already in use”
Solutions:
TCP: By default, TCP uses an OS-assigned free port, so port conflicts should be rare. If you explicitly set DYN_TCP_RPC_PORT to a specific port and get conflicts, either change the port or remove the setting to use automatic port assignment.
HTTP default port: 8888 (adjust environment variable DYN_HTTP_RPC_PORT)
Performance Considerations#
Latency#
TCP: Lowest latency due to direct connections and binary serialization
HTTP: Moderate latency with HTTP/2 overhead
NATS: Moderate latency due to NATS JetStream persistence
Resource Usage#
TCP: Minimal infrastructure (NATS required only if using KV events, can disable with --no-kv-events)
HTTP: Minimal infrastructure (NATS required only if using KV events, can disable with --no-kv-events)
NATS: Requires running NATS server (additional memory/CPU)