Kubernetes Deployment

Use Dynamo’s Kubernetes-native path when you are ready to deploy on a GPU cluster.

Use the Kubernetes guides when you are ready to move beyond a local Dynamo process and deploy on a GPU cluster. Dynamo’s Kubernetes path is native to the platform: inference graphs are expressed as Dynamo CRDs, reconciled by the Dynamo operator, installed with Helm, and integrated with Kubernetes service discovery, Gateway API Inference Extension, scheduling, observability, and model-loading workflows.

This does not make Kubernetes the only way to use Dynamo. Local containers, PyPI installs, and standalone components remain the right path for evaluation, development, and incremental adoption.

Start with the Kubernetes Quickstart to run one model end to end. Then use the rest of the Kubernetes Deployment section based on what you need next:

| Goal | Guide |
| --- | --- |
| Install the operator and prerequisites | Installation Guide |
| Deploy and manage models | Deployment Overview |
| Load models faster across pods | Model Caching and ModelExpress |
| Operate a cluster deployment | Autoscaling, Rolling Update, Disagg Communication, and Observability Metrics |
| Scale disaggregated serving | Multinode Deployments, Grove, and Topology Aware Scheduling |
| Integrate with Kubernetes serving APIs | Gateway API Inference Extension (GAIE) and LWS |

If you are still evaluating Dynamo locally, start with the Quickstart and Local Installation first.