# Event Plane Architecture
This document describes Dynamo’s event plane architecture, which handles service discovery, coordination, and event distribution using etcd and NATS.
## Overview
Dynamo’s coordination layer adapts to the deployment environment:
| Deployment | Service Discovery | KV Events | Request Plane |
|---|---|---|---|
| Kubernetes (with operator) | Native K8s (CRDs, EndpointSlices) | NATS (optional) | TCP |
| Bare metal / Local (default) | etcd | NATS (optional) | TCP |
Note: The runtime always defaults to `kv_store` (etcd) for service discovery. Kubernetes deployments must explicitly set `DYN_DISCOVERY_BACKEND=kubernetes`; the Dynamo operator handles this automatically.
```text
┌─────────────────────────────────────────────────────────────────────┐
│                         Coordination Layer                          │
│                                                                     │
│  ┌─────────────────────────┐    ┌─────────────────────────────────┐ │
│  │   Service Discovery     │    │            NATS                 │ │
│  │                         │    │         (Optional)              │ │
│  │  • K8s: CRDs + API      │    │  • KV Cache Events              │ │
│  │  • Bare metal: etcd     │    │  • Router Replica Sync          │ │
│  │                         │    │  • JetStream Persistence        │ │
│  └─────────────────────────┘    └─────────────────────────────────┘ │
│                                                                     │
└─────────────────────────────────────────────────────────────────────┘
                │                  │                  │
                ▼                  ▼                  ▼
           ┌─────────┐        ┌─────────┐        ┌─────────┐
           │Frontend │        │ Planner │        │ Worker  │
           └─────────┘        └─────────┘        └─────────┘
```
## Kubernetes-Native Service Discovery
When running on Kubernetes with the Dynamo operator, service discovery uses native Kubernetes resources instead of etcd.
### Configuration

The operator explicitly sets:

```bash
DYN_DISCOVERY_BACKEND=kubernetes
```

Important: This must be explicitly configured. The runtime defaults to `kv_store` in all environments.
### How It Works

- DynamoWorkerMetadata CRD: Workers register their endpoints by creating/updating DynamoWorkerMetadata custom resources
- EndpointSlices: Used to signal readiness status to the system
- K8s API Watches: Components watch for CRD changes to discover available endpoints (see the sketch below)
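To make the watch pattern concrete, here is a minimal sketch that tails EndpointSlice changes with the community `kube` and `k8s-openapi` crates. The crate choice, namespace, and event handling are assumptions for illustration; this is not Dynamo's actual discovery code.

```rust
// Illustrative only: not Dynamo's discovery implementation.
// Assumes recent `kube` (with the `runtime` feature), `k8s-openapi`,
// `tokio`, `futures`, and `anyhow`.
use futures::TryStreamExt;
use k8s_openapi::api::discovery::v1::EndpointSlice;
use kube::{api::Api, runtime::watcher, Client};

#[tokio::main]
async fn main() -> anyhow::Result<()> {
    let client = Client::try_default().await?;
    // Hypothetical namespace for this example
    let slices: Api<EndpointSlice> = Api::namespaced(client, "vllm-agg");

    let stream = watcher(slices, watcher::Config::default());
    let mut stream = std::pin::pin!(stream);
    while let Some(event) = stream.try_next().await? {
        match event {
            // An endpoint appeared or its readiness changed
            watcher::Event::Apply(s) | watcher::Event::InitApply(s) => {
                println!("updated: {:?}", s.metadata.name);
            }
            // An endpoint went away (e.g. its pod terminated)
            watcher::Event::Delete(s) => {
                println!("removed: {:?}", s.metadata.name);
            }
            watcher::Event::Init | watcher::Event::InitDone => {}
        }
    }
    Ok(())
}
```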
### Benefits

- No external etcd cluster required
- Native integration with Kubernetes lifecycle
- Automatic cleanup when pods terminate
- Works with standard K8s RBAC
### Environment Variables (Injected by Operator)

| Variable | Description |
|---|---|
| `DYN_DISCOVERY_BACKEND` | Set to `kubernetes` |

The operator also injects variables carrying the current pod name, the current namespace, and the pod's unique identifier.
## etcd Architecture (Default for All Deployments)

When `DYN_DISCOVERY_BACKEND=kv_store` (the global default), etcd is used for service discovery.
### Connection Configuration
The etcd connection is configured via environment variables:

| Variable | Description | Default |
|---|---|---|
| `ETCD_ENDPOINTS` | Comma-separated etcd URLs | |
| | Basic auth username | None |
| | Basic auth password | None |
| | CA certificate path (TLS) | None |
| | Client certificate path | None |
| | Client key path | None |

Example:

```bash
export ETCD_ENDPOINTS=http://etcd-0:2379,http://etcd-1:2379,http://etcd-2:2379
```
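A minimal connection sketch, assuming the community `etcd-client` crate (Dynamo wraps its own client internally) and a fallback endpoint chosen for this example:

```rust
use etcd_client::Client;

#[tokio::main]
async fn main() -> Result<(), etcd_client::Error> {
    // The fallback endpoint is an assumption for this sketch
    let endpoints: Vec<String> = std::env::var("ETCD_ENDPOINTS")
        .unwrap_or_else(|_| "http://localhost:2379".to_string())
        .split(',')
        .map(str::to_string)
        .collect();

    // `None`: no basic-auth or TLS options (see the table above)
    let mut client = Client::connect(endpoints, None).await?;

    // Round-trip a key to verify connectivity
    client.put("/healthcheck", "ok", None).await?;
    let resp = client.get("/healthcheck", None).await?;
    println!("etcd reachable; {} kv(s) returned", resp.kvs().len());
    Ok(())
}
```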
### Lease Management

Each `DistributedRuntime` maintains a primary lease with etcd:
```text
┌────────────────────┐         ┌───────────────┐
│ DistributedRuntime │◄────────│ Primary Lease │
│                    │         │   TTL: 10s    │
│ • Namespace        │         └───────┬───────┘
│ • Components       │                 │
│ • Endpoints        │                 │ Keep-Alive
│                    │                 │ Heartbeat
└────────────────────┘                 ▼
                               ┌───────────────┐
                               │     etcd      │
                               └───────────────┘
```
Lease Lifecycle:

1. Creation: Lease created during `DistributedRuntime` initialization
2. Keep-Alive: Background task sends heartbeats at 50% of remaining TTL
3. Expiration: If heartbeats stop, lease expires after TTL (10 seconds default)
4. Cleanup: All keys associated with the lease are automatically deleted

Automatic Recovery (see the sketch after this list):

- Reconnection with exponential backoff (50ms to 5s)
- Deadline-based retry logic
- Cancellation token propagation
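The sketch below walks the lease lifecycle end to end with the community `etcd-client` crate: grant, attach a key, and heartbeat at roughly half the TTL. The key name is hypothetical, and Dynamo's real task additionally layers in the backoff, deadline, and cancellation behavior listed above.

```rust
use std::time::Duration;
use etcd_client::{Client, PutOptions};

async fn run_primary_lease(mut client: Client) -> Result<(), etcd_client::Error> {
    const TTL_SECS: i64 = 10;

    // 1. Creation: grant a lease with a 10s TTL
    let lease = client.lease_grant(TTL_SECS, None).await?;
    let (mut keeper, mut responses) = client.lease_keep_alive(lease.id()).await?;

    // Keys put with this lease vanish automatically when it expires (step 4)
    let opts = PutOptions::new().with_lease(lease.id());
    client.put("/services/demo/instance", "payload", Some(opts)).await?;

    // 2. Keep-Alive: heartbeat at roughly half the TTL
    loop {
        keeper.keep_alive().await?;
        match responses.message().await? {
            // 3. Expiration: ttl == 0 means the lease is gone; re-register
            Some(resp) if resp.ttl() == 0 => break,
            Some(_) => {}
            None => break, // stream closed; reconnect with backoff
        }
        tokio::time::sleep(Duration::from_secs((TTL_SECS / 2) as u64)).await;
    }
    Ok(())
}
```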
### Service Discovery
Endpoints are registered in etcd for dynamic discovery:
Key Format:

```text
/services/{namespace}/{component}/{endpoint}/{instance_id}
```

Example:

```text
/services/vllm-agg/backend/generate/694d98147d54be25
```

Registration Data:

```json
{
  "namespace": "vllm-agg",
  "component": "backend",
  "endpoint": "generate",
  "instance_id": 7587888160958628000,
  "transport": {
    "tcp": "192.168.1.10:9999"
  }
}
```
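For illustration, the record above deserializes cleanly into a pair of serde structs like these (the type names are hypothetical; Dynamo's actual structs may differ):

```rust
use serde::{Deserialize, Serialize};

// Hypothetical types mirroring the registration JSON above
#[derive(Debug, Serialize, Deserialize)]
struct Registration {
    namespace: String,
    component: String,
    endpoint: String,
    instance_id: u64,
    transport: Transport,
}

#[derive(Debug, Serialize, Deserialize)]
struct Transport {
    tcp: String, // e.g. "192.168.1.10:9999"
}

fn main() -> serde_json::Result<()> {
    let json = r#"{
        "namespace": "vllm-agg",
        "component": "backend",
        "endpoint": "generate",
        "instance_id": 7587888160958628000,
        "transport": { "tcp": "192.168.1.10:9999" }
    }"#;
    let reg: Registration = serde_json::from_str(json)?;
    println!("{}/{}/{}", reg.namespace, reg.component, reg.endpoint);
    Ok(())
}
```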
### Discovery Queries
The discovery system supports multiple query patterns:
| Query Type | Pattern | Use Case |
|---|---|---|
| All services | `/services/` | List all services |
| Namespace | `/services/{namespace}/` | Filter by namespace |
| Component | `/services/{namespace}/{component}/` | Filter by component |
| Endpoint | `/services/{namespace}/{component}/{endpoint}/` | Specific endpoint |
### Watch Functionality
Clients watch etcd prefixes for real-time updates:
```python
# Client watches for endpoint changes (illustrative)
watcher = etcd.watch_prefix("/services/vllm-agg/backend/generate/")
for event in watcher:
    if event.type == "PUT":
        # New endpoint registered
        add_endpoint(event.value)
    elif event.type == "DELETE":
        # Endpoint removed (worker died)
        remove_endpoint(event.key)
```
Watch Features:

- Initial state retrieval with `get_and_watch_prefix()`
- Automatic reconnection on stream failure
- Revision tracking for no-event-loss guarantees
- Event types: `PUT` (create/update) and `DELETE`
### Distributed Locks
etcd provides distributed locking for coordination:
Lock Types:
| Type | Key Pattern | Behavior |
|---|---|---|
| Write Lock | | Exclusive (no readers/writers) |
| Read Lock | | Shared (multiple readers) |
Operations:

```rust
// Non-blocking write lock
let lock = client.try_write_lock("my_resource").await?;

// Blocking read lock with polling (100ms intervals)
let lock = client.read_lock_with_wait("my_resource").await?;
```
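To connect this with the transaction machinery described under "Transaction Safety" below, here is a sketch of how a non-blocking exclusive lock can be built from a create-if-absent transaction tied to a lease. It uses the community `etcd-client` crate and a hypothetical key layout; Dynamo's `try_write_lock` internals are not shown here.

```rust
use etcd_client::{Client, Compare, CompareOp, PutOptions, Txn, TxnOp};

async fn try_write_lock(
    client: &mut Client,
    name: &str,
    lease_id: i64,
) -> Result<bool, etcd_client::Error> {
    let key = format!("/locks/{name}/write"); // hypothetical key layout
    let txn = Txn::new()
        // Succeeds only if the key has never been created (create revision == 0)
        .when(vec![Compare::create_revision(key.clone(), CompareOp::Equal, 0)])
        .and_then(vec![TxnOp::put(
            key,
            "holder",
            // Tie the lock to a lease so it is released if the holder dies
            Some(PutOptions::new().with_lease(lease_id)),
        )]);
    let resp = client.txn(txn).await?;
    // succeeded() is true only when the guard held and the put ran
    Ok(resp.succeeded())
}
```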
## NATS Architecture

### When NATS is Used

NATS is used for:

- KV Cache Events: Real-time KV cache state updates for routing
- Router Replica Sync: Synchronizing router state across replicas
- Legacy Request Plane: NATS-based request transport (optional)
### Configuration

| Variable | Description | Default |
|---|---|---|
| | NATS server URL | |
### Disabling NATS

For deployments without KV-aware routing:

```bash
# Disable NATS and KV events
python -m dynamo.frontend --no-kv-events
```

This enables "approximate mode" for KV routing without event persistence.
### Event Publishing

Components publish events to NATS subjects:

```rust
pub trait EventPublisher {
    async fn publish(&self, event: &str, data: &[u8]) -> Result<()>;
    async fn publish_serialized<T: Serialize>(&self, event: &str, data: &T) -> Result<()>;
}
```

Subject Naming:

```text
{base_subject}.{event_name}
```

Example:

```text
vllm-agg.backend.kv_cache_update
```
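A sketch of publishing directly with the `async_nats` crate, following the subject scheme above. The base subject and payload here are hypothetical; the `EventPublisher` trait wraps these details in Dynamo itself.

```rust
// Assumes the `async_nats`, `serde_json`, and `tokio` crates
#[tokio::main]
async fn main() -> Result<(), async_nats::Error> {
    let client = async_nats::connect("nats://localhost:4222").await?;

    // {base_subject}.{event_name}
    let base_subject = "vllm-agg.backend"; // hypothetical base subject
    let subject = format!("{base_subject}.kv_cache_update");

    let payload = serde_json::to_vec(&serde_json::json!({ "blocks": [1, 2, 3] }))?;
    client.publish(subject, payload.into()).await?;
    client.flush().await?; // push the message out of the client buffer
    Ok(())
}
```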
### Event Subscription

Components subscribe to events:

```rust
pub trait EventSubscriber {
    async fn subscribe(&self, topic: &str) -> Result<Subscriber>;
    async fn subscribe_typed<T: DeserializeOwned>(&self, topic: &str) -> Result<TypedSubscriber<T>>;
}
```
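And the receiving side, again sketched with `async_nats`; a typed subscriber would additionally deserialize each payload:

```rust
use futures::StreamExt;

#[tokio::main]
async fn main() -> Result<(), async_nats::Error> {
    let client = async_nats::connect("nats://localhost:4222").await?;
    let mut subscriber = client.subscribe("vllm-agg.backend.kv_cache_update").await?;

    while let Some(message) = subscriber.next().await {
        // subscribe_typed would do serde_json::from_slice::<T>(&message.payload)
        println!("{} bytes on {}", message.payload.len(), message.subject);
    }
    Ok(())
}
```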
### JetStream Persistence

For durable event delivery, NATS JetStream provides (see the sketch below):

- Message persistence
- Replay from offset
- Consumer groups for load balancing
- Acknowledgment tracking
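A sketch of those properties with `async_nats`'s JetStream API; the stream and consumer names are hypothetical, and Dynamo's actual stream configuration may differ.

```rust
use async_nats::jetstream::{self, consumer::pull, stream};
use futures::StreamExt;

#[tokio::main]
async fn main() -> Result<(), async_nats::Error> {
    let client = async_nats::connect("nats://localhost:4222").await?;
    let js = jetstream::new(client);

    // Persist every subject under the event hierarchy (message persistence)
    let stream = js
        .get_or_create_stream(stream::Config {
            name: "DYNAMO_EVENTS".to_string(),
            subjects: vec!["vllm-agg.>".to_string()],
            ..Default::default()
        })
        .await?;

    // Durable pull consumer: survives restarts, tracks acknowledgments,
    // and can be shared across replicas for load balancing
    let consumer = stream
        .get_or_create_consumer(
            "router",
            pull::Config {
                durable_name: Some("router".to_string()),
                ..Default::default()
            },
        )
        .await?;

    let mut messages = consumer.messages().await?;
    while let Some(message) = messages.next().await {
        let message = message?;
        message.ack().await?; // acknowledgment tracking
    }
    Ok(())
}
```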
## Key-Value Store Abstraction
Dynamo provides a unified KV store interface supporting multiple backends:
### Supported Backends

| Backend | Use Case | Configuration |
|---|---|---|
| etcd | Production deployments | |
| | Testing, development | Default |
| NATS | NATS-only deployments | |
| | Local persistence | File path |
### Store Interface

```rust
pub trait KvStore {
    async fn get(&self, bucket: &str, key: &str) -> Result<Option<Vec<u8>>>;
    async fn put(&self, bucket: &str, key: &str, value: &[u8]) -> Result<()>;
    async fn delete(&self, bucket: &str, key: &str) -> Result<()>;
    async fn watch(&self, bucket: &str) -> Result<WatchStream>;
}
```
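As a usage illustration, a helper written against this trait might persist a model card as JSON. The bucket and key names are hypothetical, and the trait's `Result` is assumed here to be anyhow-compatible.

```rust
use serde::Serialize;

#[derive(Serialize)]
struct ModelCard {
    name: String,
    max_context: u32,
}

// Generic over any backend that implements the KvStore trait above
async fn publish_model_card<S: KvStore>(store: &S, card: &ModelCard) -> anyhow::Result<()> {
    let value = serde_json::to_vec(card)?;
    // Buckets namespace the keys (see "Buckets" below)
    store.put("model-cards", &card.name, &value).await?;

    // Read-your-write sanity check
    let stored = store.get("model-cards", &card.name).await?;
    anyhow::ensure!(stored.as_deref() == Some(value.as_slice()));
    Ok(())
}
```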
### Buckets

Data is organized into logical buckets, including:

- An endpoint instance registry
- Model deployment cards
### Typed Prefix Watcher

For type-safe watching of etcd prefixes:

```rust
// Watch and maintain a HashMap of deserialized values
let watcher = watch_prefix_with_extraction::<DiscoveryInstance>(
    &etcd_client,
    "/services/vllm-agg/",
    lease_id_extractor,
    value_extractor,
).await?;

// Receive updates via watch channel
let instances = watcher.borrow();
```
Key Extractors:

- Use the lease ID as the key
- Extract the key with prefix stripping
- Use the full etcd key
## Reliability Features

### Connection Resilience

etcd Reconnection:

- Exponential backoff: 50ms to 5s
- Deadline-based retry logic
- Mutex ensures single concurrent reconnect

NATS Reconnection:

- Built-in reconnection in NATS client
- Configurable max reconnect attempts
- Buffering during disconnection
### Lease-Based Cleanup

When a worker crashes or loses connectivity:

1. Keep-alive heartbeats stop
2. Lease expires after TTL (10 seconds)
3. All registered endpoints are automatically deleted
4. Clients receive DELETE watch events
5. Traffic reroutes to healthy workers
### Transaction Safety

etcd transactions ensure atomic operations:

```rust
// Atomic create-if-not-exists
let txn = Txn::new()
    .when([Compare::create_revision(key, CompareOp::Equal, 0)])
    .and_then([Op::put(key, value, options)]);
etcd_client.txn(txn).await?;
```
This prevents race conditions in concurrent service registration.
## Operational Modes

### Kubernetes Mode (Requires Explicit Configuration)

Native Kubernetes service discovery:

```bash
# Operator explicitly sets this (not auto-detected):
export DYN_DISCOVERY_BACKEND=kubernetes

# Workers register via K8s CRDs
python -m dynamo.vllm --model Qwen/Qwen3-0.6B

# Frontend discovers workers via K8s API
python -m dynamo.frontend
```
No etcd or NATS required for basic operation when using K8s discovery.
### KV Store Mode (Global Default)

Full service discovery with etcd:

```bash
# This is the default - no configuration needed
# export DYN_DISCOVERY_BACKEND=kv_store  # (implicit)

# Workers register with etcd
python -m dynamo.vllm --model Qwen/Qwen3-0.6B

# Frontend discovers workers via etcd
python -m dynamo.frontend
```
### KV-Aware Routing (Optional)

Enable NATS for KV cache event tracking:

```bash
# Default: KV events enabled (requires NATS)
python -m dynamo.frontend --router-mode kv

# Disable KV events for prediction-based routing (no NATS)
python -m dynamo.frontend --router-mode kv --no-kv-events
```

With `--no-kv-events`:

- Router predicts cache state based on routing decisions
- TTL-based expiration and LRU pruning
- No NATS infrastructure required
## Best Practices

### 1. Use Kubernetes Discovery on K8s

The Dynamo operator automatically sets `DYN_DISCOVERY_BACKEND=kubernetes` for pods. No additional setup is required when using the operator.
### 2. For Bare Metal: Deploy etcd Cluster

For bare-metal production deployments, deploy a 3-node etcd cluster for high availability.
### 3. Configure Appropriate TTLs (etcd mode)

Balance between detection speed and overhead:

- Short TTL (5s): Faster failure detection, more keep-alive traffic
- Long TTL (30s): Less overhead, slower detection
### 4. KV Routing Without NATS

For simpler deployments without NATS:

```bash
# Use prediction-based KV routing
python -m dynamo.frontend --router-mode kv --no-kv-events
```

This provides KV-aware routing with reduced accuracy but no NATS dependency.