Custom Backend Overview

Choose the right path for bringing your own engine to Dynamo

Dynamo supports custom backends through one preferred unified contract, a lower-level worker path, and a packaging path:

| Path | Use when |
| --- | --- |
| Writing Unified Backends | You are writing a new token-in-token-out engine in Python or Rust and want Dynamo to own the runtime lifecycle. |
| Python Workers (lower-level) | You need the older register_model and serve_endpoint path for features the unified backend does not cover yet. |
| Runtime Containers | You need to package a built-in or custom backend into a deployable Dynamo image. |

The unified backend path is the preferred starting point for new custom engines. It gives Python and Rust backends the same lifecycle shape: parse arguments, start the engine, stream generated chunks, handle cancellation, drain, and clean up. The Dynamo framework owns runtime registration, signal handling, model registration, and graceful shutdown.
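The lifecycle shape described above can be pictured with a small sketch. This is not the Dynamo API: `EchoEngine`, its method names, the chunk dictionary layout, and `run_once` are all hypothetical, chosen only to show the six steps in order. In a real deployment the framework, not your code, would drive these calls and handle registration and signals.

```python
import asyncio
from typing import AsyncIterator


class EchoEngine:
    """Hypothetical token-in-token-out engine (illustration only, not a Dynamo class)."""

    def __init__(self) -> None:
        self.started = False
        self.in_flight = 0

    @classmethod
    def from_args(cls, argv: list[str]) -> "EchoEngine":
        # Step 1: parse arguments. A real engine would read a model path, etc.
        return cls()

    async def start(self) -> None:
        # Step 2: start the engine (load weights, warm up, ...).
        self.started = True

    async def generate(
        self, token_ids: list[int], cancelled: asyncio.Event
    ) -> AsyncIterator[dict]:
        # Step 3: stream generated chunks. This toy engine echoes its input.
        self.in_flight += 1
        try:
            for tok in token_ids:
                # Step 4: handle cancellation between chunks.
                if cancelled.is_set():
                    yield {"token_ids": [], "finish_reason": "cancelled"}
                    return
                yield {"token_ids": [tok], "finish_reason": None}
                await asyncio.sleep(0)  # give other tasks a chance to run
            yield {"token_ids": [], "finish_reason": "stop"}
        finally:
            self.in_flight -= 1

    async def drain(self) -> None:
        # Step 5: drain, i.e. wait until no requests are in flight.
        while self.in_flight:
            await asyncio.sleep(0.01)

    async def cleanup(self) -> None:
        # Step 6: clean up engine resources.
        self.started = False


async def run_once(argv: list[str], prompt: list[int]) -> list[dict]:
    """Stand-in for the framework: walks one request through the lifecycle."""
    engine = EchoEngine.from_args(argv)
    await engine.start()
    cancelled = asyncio.Event()
    chunks = [chunk async for chunk in engine.generate(prompt, cancelled)]
    await engine.drain()
    await engine.cleanup()
    return chunks


if __name__ == "__main__":
    for chunk in asyncio.run(run_once([], [1, 2, 3])):
        print(chunk)
```

Keeping cancellation as a flag checked between chunks is one simple design choice. The actual unified contract may expose cancellation differently, so treat the signature here as an assumption.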

Use the lower-level Python worker path when your backend needs features that are still outside the unified contract, such as multimodal support, LoRA adapter management, logprobs, guided decoding, engine-specific routes, or custom request handling.

If you want KV-cache-aware routing for your custom engine, also implement KV Events for Custom Engines so the Dynamo router can track which workers hold each prefix.
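To see why the router needs these events, consider a toy model of the bookkeeping involved. The sketch below is not Dynamo's router or its event schema: `PrefixIndex`, the `stored` and `removed` method names, and the integer block hashes are hypothetical stand-ins. It only illustrates the idea that a worker reports which KV blocks it holds, and a router can then prefer the worker with the longest cached prefix.

```python
from collections import defaultdict
from typing import Optional


class PrefixIndex:
    """Toy router-side index of which workers hold which KV block hashes.

    Hypothetical illustration; the real event format is defined by
    KV Events for Custom Engines.
    """

    def __init__(self) -> None:
        self._blocks: dict[str, set[int]] = defaultdict(set)

    def stored(self, worker: str, block_hashes: list[int]) -> None:
        # The worker reports that it now caches these blocks.
        self._blocks[worker].update(block_hashes)

    def removed(self, worker: str, block_hashes: list[int]) -> None:
        # The worker reports that it evicted these blocks.
        self._blocks[worker].difference_update(block_hashes)

    def overlap(self, worker: str, request_hashes: list[int]) -> int:
        # Count the leading blocks of the request that the worker already holds.
        held = self._blocks.get(worker, set())
        count = 0
        for block_hash in request_hashes:
            if block_hash not in held:
                break
            count += 1
        return count

    def best_worker(self, request_hashes: list[int]) -> Optional[str]:
        # Pick the known worker with the longest cached prefix, if any.
        return max(
            self._blocks,
            key=lambda worker: self.overlap(worker, request_hashes),
            default=None,
        )


if __name__ == "__main__":
    index = PrefixIndex()
    index.stored("worker-a", [11, 22, 33])
    index.stored("worker-b", [11])
    print(index.best_worker([11, 22, 44]))
```

Without eviction events the index would drift from what the workers actually hold, which is why both kinds of update are modeled here.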