Event Plane Architecture#

This document describes Dynamo’s event plane architecture, which handles service discovery, coordination, and event distribution using etcd and NATS.

Overview#

Dynamo’s coordination layer adapts to the deployment environment:

| Deployment                   | Service Discovery                 | KV Events       | Request Plane |
|------------------------------|-----------------------------------|-----------------|---------------|
| Kubernetes (with operator)   | Native K8s (CRDs, EndpointSlices) | NATS (optional) | TCP           |
| Bare metal / Local (default) | etcd                              | NATS (optional) | TCP           |

Note: The runtime always defaults to kv_store (etcd) for service discovery. Kubernetes deployments must explicitly set DYN_DISCOVERY_BACKEND=kubernetes; the Dynamo operator handles this automatically.

┌─────────────────────────────────────────────────────────────────────┐
│                    Coordination Layer                                │
│                                                                      │
│  ┌─────────────────────────┐    ┌─────────────────────────────────┐ │
│  │   Service Discovery     │    │            NATS                 │ │
│  │                         │    │         (Optional)              │ │
│  │  • K8s: CRDs + API      │    │  • KV Cache Events              │ │
│  │  • Bare metal: etcd     │    │  • Router Replica Sync          │ │
│  │                         │    │  • JetStream Persistence        │ │
│  └─────────────────────────┘    └─────────────────────────────────┘ │
│                                                                      │
└─────────────────────────────────────────────────────────────────────┘
                    │                          │
         ┌──────────┴──────────┐    ┌─────────┴──────────┐
         ▼                     ▼    ▼                    ▼
    ┌─────────┐          ┌─────────┐              ┌─────────┐
    │Frontend │          │ Planner │              │ Worker  │
    └─────────┘          └─────────┘              └─────────┘

Kubernetes-Native Service Discovery#

When running on Kubernetes with the Dynamo operator, service discovery uses native Kubernetes resources instead of etcd.

Configuration#

The operator explicitly sets:

DYN_DISCOVERY_BACKEND=kubernetes

Important: This must be explicitly configured. The runtime defaults to kv_store in all environments.

How It Works#

  1. DynamoWorkerMetadata CRD: Workers register their endpoints by creating/updating DynamoWorkerMetadata custom resources

  2. EndpointSlices: Used to signal readiness status to the system

  3. K8s API Watches: Components watch for CRD changes to discover available endpoints

Benefits#

  • No external etcd cluster required

  • Native integration with Kubernetes lifecycle

  • Automatic cleanup when pods terminate

  • Works with standard K8s RBAC

Environment Variables (Injected by Operator)#

| Variable              | Description           |
|-----------------------|-----------------------|
| DYN_DISCOVERY_BACKEND | Set to kubernetes     |
| POD_NAME              | Current pod name      |
| POD_NAMESPACE         | Current namespace     |
| POD_UID               | Pod unique identifier |


etcd Architecture (Default for All Deployments)#

When DYN_DISCOVERY_BACKEND=kv_store (the global default), etcd is used for service discovery.

Connection Configuration#

etcd connection is configured via environment variables:

| Variable              | Description               | Default               |
|-----------------------|---------------------------|-----------------------|
| ETCD_ENDPOINTS        | Comma-separated etcd URLs | http://localhost:2379 |
| ETCD_AUTH_USERNAME    | Basic auth username       | None                  |
| ETCD_AUTH_PASSWORD    | Basic auth password       | None                  |
| ETCD_AUTH_CA          | CA certificate path (TLS) | None                  |
| ETCD_AUTH_CLIENT_CERT | Client certificate path   | None                  |
| ETCD_AUTH_CLIENT_KEY  | Client key path           | None                  |

Example:

export ETCD_ENDPOINTS=http://etcd-0:2379,http://etcd-1:2379,http://etcd-2:2379

Lease Management#

Each DistributedRuntime maintains a primary lease with etcd:

┌────────────────────┐         ┌───────────────┐
│ DistributedRuntime │◄────────│ Primary Lease │
│                    │         │  TTL: 10s     │
│  • Namespace       │         └───────┬───────┘
│  • Components      │                 │
│  • Endpoints       │                 │ Keep-Alive
│                    │                 │ Heartbeat
└────────────────────┘                 ▼
                               ┌──────────────┐
                               │     etcd     │
                               └──────────────┘

Lease Lifecycle:

  1. Creation: Lease created during DistributedRuntime initialization

  2. Keep-Alive: A background task sends keep-alive heartbeats, each scheduled after 50% of the remaining TTL has elapsed

  3. Expiration: If heartbeats stop, the lease expires after the TTL (10 seconds by default)

  4. Cleanup: All keys associated with the lease are automatically deleted
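
A minimal sketch of this lifecycle, shown here with the python-etcd3 client purely for illustration (Dynamo's runtime implements the equivalent logic internally in Rust):

import time
import etcd3

# Connect to etcd and grant a lease with a 10-second TTL
client = etcd3.client(host="localhost", port=2379)
lease = client.lease(ttl=10)

# Register an endpoint key under the lease; the key lives only as long as the lease
client.put(
    "/services/vllm-agg/backend/generate/694d98147d54be25",
    b'{"transport": {"tcp": "192.168.1.10:9999"}}',
    lease=lease,
)

# Keep-alive: refresh well before the TTL expires (roughly half the TTL here)
while True:
    lease.refresh()
    time.sleep(5)

# If refreshes stop, the lease expires after 10 seconds and etcd deletes the key.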

Automatic Recovery:

  • Reconnection with exponential backoff (50ms to 5s)

  • Deadline-based retry logic

  • Cancellation token propagation

Service Discovery#

Endpoints are registered in etcd for dynamic discovery:

Key Format:

/services/{namespace}/{component}/{endpoint}/{instance_id}

Example:

/services/vllm-agg/backend/generate/694d98147d54be25

Registration Data:

{
  "namespace": "vllm-agg",
  "component": "backend",
  "endpoint": "generate",
  "instance_id": 7587888160958628000,
  "transport": {
    "tcp": "192.168.1.10:9999"
  }
}

Discovery Queries#

The discovery system supports multiple query patterns:

| Query Type          | Pattern                                       | Use Case            |
|---------------------|-----------------------------------------------|---------------------|
| AllEndpoints        | /services/                                    | List all services   |
| NamespacedEndpoints | /services/{namespace}/                        | Filter by namespace |
| ComponentEndpoints  | /services/{namespace}/{component}/            | Filter by component |
| Endpoint            | /services/{namespace}/{component}/{endpoint}/ | Specific endpoint   |
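
For illustration, the ComponentEndpoints and Endpoint patterns map to plain prefix reads. This sketch uses the python-etcd3 client; Dynamo's own discovery client is Rust:

import etcd3

client = etcd3.client(host="localhost", port=2379)

# ComponentEndpoints: every endpoint exposed by one component
for value, metadata in client.get_prefix("/services/vllm-agg/backend/"):
    print(metadata.key.decode(), value.decode())

# Endpoint: all instances serving one specific endpoint
instances = list(client.get_prefix("/services/vllm-agg/backend/generate/"))
print(f"{len(instances)} instance(s) available")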

Watch Functionality#

Clients watch etcd prefixes for real-time updates:

# Client watches for endpoint changes
watcher = etcd.watch_prefix("/services/vllm-agg/backend/generate/")

for event in watcher:
    if event.type == "PUT":
        # New endpoint registered
        add_endpoint(event.value)
    elif event.type == "DELETE":
        # Endpoint removed (worker died)
        remove_endpoint(event.key)

Watch Features:

  • Initial state retrieval with get_and_watch_prefix()

  • Automatic reconnection on stream failure

  • Revision tracking for no-event-loss guarantees

  • Event types: PUT (create/update) and DELETE

Distributed Locks#

etcd provides distributed locking for coordination:

Lock Types:

| Type       | Key Pattern              | Behavior                       |
|------------|--------------------------|--------------------------------|
| Write Lock | v1/{prefix}/writer       | Exclusive (no readers/writers) |
| Read Lock  | v1/{prefix}/readers/{id} | Shared (multiple readers)      |

Operations:

// Non-blocking write lock
let lock = client.try_write_lock("my_resource").await?;

// Blocking read lock with polling (100ms intervals)
let lock = client.read_lock_with_wait("my_resource").await?;

NATS Architecture#

When NATS is Used#

NATS is used for:

  1. KV Cache Events: Real-time KV cache state updates for routing

  2. Router Replica Sync: Synchronizing router state across replicas

  3. Legacy Request Plane: NATS-based request transport (optional)

Configuration#

| Variable    | Description     | Default               |
|-------------|-----------------|-----------------------|
| NATS_SERVER | NATS server URL | nats://localhost:4222 |

Disabling NATS#

For deployments without KV-aware routing:

# Disable NATS and KV events
python -m dynamo.frontend --no-kv-events

This enables “approximate mode” for KV routing without event persistence.

Event Publishing#

Components publish events to NATS subjects:

pub trait EventPublisher {
    async fn publish(&self, event: &str, data: &[u8]) -> Result<()>;
    async fn publish_serialized<T: Serialize>(&self, event: &str, data: &T) -> Result<()>;
}

Subject Naming:

{base_subject}.{event_name}

Example:

vllm-agg.backend.kv_cache_update
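
A sketch of publishing to such a subject, shown with the nats-py client; the payload fields are hypothetical and Dynamo's actual publishers are Rust:

import asyncio
import json
import nats

async def main():
    # Connect and publish one KV cache event to the derived subject
    nc = await nats.connect("nats://localhost:4222")
    event = {"blocks_added": 4, "blocks_evicted": 1}  # hypothetical payload
    await nc.publish("vllm-agg.backend.kv_cache_update", json.dumps(event).encode())
    await nc.drain()

asyncio.run(main())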

Event Subscription#

Components subscribe to events:

pub trait EventSubscriber {
    async fn subscribe(&self, topic: &str) -> Result<Subscriber>;
    async fn subscribe_typed<T: DeserializeOwned>(&self, topic: &str) -> Result<TypedSubscriber<T>>;
}
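
The subscriber side, again sketched with nats-py under the same assumptions:

import asyncio
import json
import nats

async def main():
    nc = await nats.connect("nats://localhost:4222")
    # Core NATS subscription; messages arrive as raw bytes and are deserialized here
    sub = await nc.subscribe("vllm-agg.backend.kv_cache_update")
    async for msg in sub.messages:
        print("kv cache update:", json.loads(msg.data))

asyncio.run(main())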

JetStream Persistence#

For durable event delivery, NATS JetStream provides:

  • Message persistence

  • Replay from offset

  • Consumer groups for load balancing

  • Acknowledgment tracking
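
A small nats-py sketch of durable delivery through JetStream; the stream and consumer names are illustrative, not Dynamo's actual configuration:

import asyncio
import nats

async def main():
    nc = await nats.connect("nats://localhost:4222")
    js = nc.jetstream()

    # Persist every subject under the vllm-agg namespace in a stream
    await js.add_stream(name="kv_events", subjects=["vllm-agg.>"])
    await js.publish("vllm-agg.backend.kv_cache_update", b'{"blocks_added": 4}')

    # Durable pull consumer: survives restarts and tracks acknowledgments
    sub = await js.pull_subscribe("vllm-agg.backend.kv_cache_update", durable="router_replica")
    for msg in await sub.fetch(1):
        await msg.ack()

    await nc.drain()

asyncio.run(main())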

Key-Value Store Abstraction#

Dynamo provides a unified KV store interface supporting multiple backends:

Supported Backends#

| Backend     | Use Case               | Configuration  |
|-------------|------------------------|----------------|
| EtcdStore   | Production deployments | ETCD_ENDPOINTS |
| MemoryStore | Testing, development   | Default        |
| NatsStore   | NATS-only deployments  | NATS_SERVER    |
| FileStore   | Local persistence      | File path      |

Store Interface#

pub trait KvStore {
    async fn get(&self, bucket: &str, key: &str) -> Result<Option<Vec<u8>>>;
    async fn put(&self, bucket: &str, key: &str, value: &[u8]) -> Result<()>;
    async fn delete(&self, bucket: &str, key: &str) -> Result<()>;
    async fn watch(&self, bucket: &str) -> Result<WatchStream>;
}

Buckets#

Data is organized into logical buckets:

| Bucket       | Purpose                    |
|--------------|----------------------------|
| v1/instances | Endpoint instance registry |
| v1/mdc       | Model deployment cards     |
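
A rough, synchronous Python rendering of the store interface and bucket layout, purely for illustration (the real interface is the async Rust trait above and also exposes a watch stream, omitted here):

from collections import defaultdict
from typing import Optional

class MemoryStore:
    """In-memory stand-in for the KV store interface (watch omitted)."""

    def __init__(self) -> None:
        self._buckets: dict[str, dict[str, bytes]] = defaultdict(dict)

    def get(self, bucket: str, key: str) -> Optional[bytes]:
        return self._buckets[bucket].get(key)

    def put(self, bucket: str, key: str, value: bytes) -> None:
        self._buckets[bucket][key] = value

    def delete(self, bucket: str, key: str) -> None:
        self._buckets[bucket].pop(key, None)

# Instance registry entries live in the v1/instances bucket
store = MemoryStore()
store.put("v1/instances", "694d98147d54be25", b'{"transport": {"tcp": "192.168.1.10:9999"}}')
print(store.get("v1/instances", "694d98147d54be25"))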

Typed Prefix Watcher#

For type-safe watching of etcd prefixes:

// Watch and maintain HashMap of deserialized values
let watcher = watch_prefix_with_extraction::<DiscoveryInstance>(
    &etcd_client,
    "/services/vllm-agg/",
    lease_id_extractor,
    value_extractor,
).await?;

// Receive updates via watch channel
let instances = watcher.borrow();
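
A rough Python analogue of the same idea, assuming the python-etcd3 client: watch a prefix and keep a local dict of deserialized values keyed by the etcd key:

import json
import etcd3
from etcd3.events import PutEvent, DeleteEvent

# Local map of deserialized instances, keyed by the full etcd key
instances = {}
client = etcd3.client(host="localhost", port=2379)
events, cancel = client.watch_prefix("/services/vllm-agg/")

for event in events:
    key = event.key.decode()
    if isinstance(event, PutEvent):
        instances[key] = json.loads(event.value)
    elif isinstance(event, DeleteEvent):
        instances.pop(key, None)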

Key Extractors:

| Extractor         | Description                       |
|-------------------|-----------------------------------|
| lease_id()        | Use lease ID as key               |
| key_string()      | Extract key with prefix stripping |
| full_key_string() | Use full etcd key                 |

Reliability Features#

Connection Resilience#

etcd Reconnection:

  • Exponential backoff: 50ms to 5s

  • Deadline-based retry logic

  • Mutex ensures single concurrent reconnect
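
The policy above, sketched as a standalone Python helper using python-etcd3; the backoff bounds mirror the documented 50 ms to 5 s range, while the jitter and connection probe are illustrative choices:

import random
import time
import etcd3

def connect_with_backoff(host="localhost", port=2379, deadline_s=60.0):
    # Exponential backoff from 50 ms up to 5 s, bounded by an overall deadline
    delay, start = 0.05, time.monotonic()
    while time.monotonic() - start < deadline_s:
        try:
            client = etcd3.client(host=host, port=port)
            client.status()  # probe the connection before declaring success
            return client
        except Exception:
            time.sleep(delay + random.uniform(0, delay / 2))
            delay = min(delay * 2, 5.0)
    raise TimeoutError("could not reconnect to etcd before the deadline")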

NATS Reconnection:

  • Built-in reconnection in NATS client

  • Configurable max reconnect attempts

  • Buffering during disconnection
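
As an illustration of these client-side options, this is how they are set in the nats-py client (Dynamo's own NATS client is Rust; the values are illustrative):

import asyncio
import nats

async def main():
    # Reconnect behavior is configured on the client at connect time
    nc = await nats.connect(
        servers=["nats://localhost:4222"],
        max_reconnect_attempts=60,  # give up after this many attempts
        reconnect_time_wait=2,      # seconds to wait between attempts
    )
    await nc.drain()

asyncio.run(main())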

Lease-Based Cleanup#

When a worker crashes or loses connectivity:

  1. Keep-alive heartbeats stop

  2. Lease expires after TTL (10 seconds)

  3. All registered endpoints automatically deleted

  4. Clients receive DELETE watch events

  5. Traffic reroutes to healthy workers

Transaction Safety#

etcd transactions ensure atomic operations:

// Atomic create-if-not-exists
let txn = Txn::new()
    .when([Compare::create_revision(key, CompareOp::Equal, 0)])
    .and_then([Op::put(key, value, options)]);

etcd_client.txn(txn).await?;

This prevents race conditions in concurrent service registration.
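
The same create-if-not-exists idiom sketched in python-etcd3 for illustration (comparing the key's version to 0 tests that it has never been written, equivalent to the create-revision check above):

import etcd3

client = etcd3.client(host="localhost", port=2379)
key = "/services/vllm-agg/backend/generate/694d98147d54be25"

# Atomic create-if-not-exists: the put only happens if the key does not exist yet
created, _responses = client.transaction(
    compare=[client.transactions.version(key) == 0],
    success=[client.transactions.put(key, b'{"transport": {"tcp": "192.168.1.10:9999"}}')],
    failure=[],
)
print("registered" if created else "already registered by another worker")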

Operational Modes#

Kubernetes Mode (Requires Explicit Configuration)#

Native Kubernetes service discovery:

# Operator explicitly sets this (not auto-detected):
export DYN_DISCOVERY_BACKEND=kubernetes

# Workers register via K8s CRDs
python -m dynamo.vllm --model Qwen/Qwen3-0.6B

# Frontend discovers workers via K8s API
python -m dynamo.frontend

No etcd or NATS required for basic operation when using K8s discovery.

KV Store Mode (Global Default)#

Full service discovery with etcd:

# This is the default - no configuration needed
# export DYN_DISCOVERY_BACKEND=kv_store  # (implicit)

# Workers register with etcd
python -m dynamo.vllm --model Qwen/Qwen3-0.6B

# Frontend discovers workers via etcd
python -m dynamo.frontend

KV-Aware Routing (Optional)#

Enable NATS for KV cache event tracking:

# Default: KV events enabled (requires NATS)
python -m dynamo.frontend --router-mode kv

# Disable KV events for prediction-based routing (no NATS)
python -m dynamo.frontend --router-mode kv --no-kv-events

With --no-kv-events:

  • Router predicts cache state based on routing decisions

  • TTL-based expiration and LRU pruning

  • No NATS infrastructure required

Best Practices#

1. Use Kubernetes Discovery on K8s#

The Dynamo operator automatically sets DYN_DISCOVERY_BACKEND=kubernetes for pods. No additional setup required when using the operator.

2. For Bare Metal: Deploy etcd Cluster#

For bare-metal production deployments, deploy a 3-node etcd cluster for high availability.

3. Configure Appropriate TTLs (etcd mode)#

Balance between detection speed and overhead:

  • Short TTL (5s): Faster failure detection, more keep-alive traffic

  • Long TTL (30s): Less overhead, slower detection

4. KV Routing Without NATS#

For simpler deployments without NATS:

# Use prediction-based KV routing
python -m dynamo.frontend --router-mode kv --no-kv-events

This provides KV-aware routing with reduced accuracy but no NATS dependency.