Dynamo Kubernetes Platform#
Use the Dynamo Kubernetes Platform to deploy and manage Dynamo inference graphs on Kubernetes with automated orchestration and scaling.
Quick Start Paths#
Path A (Production Install): install from published artifacts on your existing cluster → Jump to Path A
Path B (Local Development): set up Minikube first (→ Minikube Setup), then follow Path A
Path C (Custom Development): build from source for customization → Jump to Path C
Prerequisites#
# Required tools
kubectl version --client # v1.24+
helm version # v3.0+
docker version # Running daemon
# Set your inference runtime image
export DYNAMO_IMAGE=nvcr.io/nvidia/ai-dynamo/vllm-runtime:0.4.1
# Also available: sglang-runtime, tensorrtllm-runtime
Tip
No cluster? See Minikube Setup for local development.
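The version requirements above can be checked mechanically. Below is a minimal sketch of a pre-flight helper; `version_ge` is a hypothetical name, and it assumes a `sort` that supports `-V` (GNU coreutils and recent BSD variants).

```shell
# version_ge A B succeeds when version A >= version B.
version_ge() {
  [ "$(printf '%s\n%s\n' "$2" "$1" | sort -V | head -n1)" = "$2" ]
}

# Example: require kubectl client v1.24+ (extract the client version however you prefer).
version_ge 1.29.4 1.24 && echo "kubectl version OK"
```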
Path A: Production Install#
Install from NGC published artifacts in 3 steps.
# 1. Set environment
export NAMESPACE=dynamo-kubernetes
export RELEASE_VERSION=0.4.1 # any version of Dynamo 0.3.2+
# 2. Install CRDs
helm fetch https://helm.ngc.nvidia.com/nvidia/ai-dynamo/charts/dynamo-crds-${RELEASE_VERSION}.tgz
helm install dynamo-crds dynamo-crds-${RELEASE_VERSION}.tgz --namespace default
# 3. Install Platform
kubectl create namespace ${NAMESPACE}
helm fetch https://helm.ngc.nvidia.com/nvidia/ai-dynamo/charts/dynamo-platform-${RELEASE_VERSION}.tgz
helm install dynamo-platform dynamo-platform-${RELEASE_VERSION}.tgz --namespace ${NAMESPACE}
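The two chart URLs above follow a single pattern, so pinning a different release only means changing `RELEASE_VERSION`. A sketch of deriving both URLs (`NGC_CHARTS` is just a local variable used here, not something the charts require):

```shell
# Derive the CRDs and platform chart URLs from one release version.
NGC_CHARTS="https://helm.ngc.nvidia.com/nvidia/ai-dynamo/charts"
RELEASE_VERSION=0.4.1
for chart in dynamo-crds dynamo-platform; do
  echo "${NGC_CHARTS}/${chart}-${RELEASE_VERSION}.tgz"
done
```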
Path C: Custom Development#
Build and deploy from source for customization.
Quick Deploy Script#
# 1. Set environment
export NAMESPACE=dynamo-cloud
export DOCKER_SERVER=nvcr.io/nvidia/ai-dynamo/ # or your registry
export DOCKER_USERNAME='$oauthtoken'
export DOCKER_PASSWORD=<YOUR_NGC_CLI_API_KEY>
export IMAGE_TAG=0.4.1
# 2. Build operator
cd deploy/cloud/operator
earthly --push +docker --DOCKER_SERVER=$DOCKER_SERVER --IMAGE_TAG=$IMAGE_TAG
cd -
# 3. Create namespace and secrets
kubectl create namespace ${NAMESPACE}
kubectl create secret docker-registry docker-imagepullsecret \
--docker-server=${DOCKER_SERVER} \
--docker-username=${DOCKER_USERNAME} \
--docker-password=${DOCKER_PASSWORD} \
--namespace=${NAMESPACE}
# 4. Deploy
helm repo add bitnami https://charts.bitnami.com/bitnami
./deploy.sh --crds
Manual Steps (Alternative)#
Step 1: Install CRDs
helm install dynamo-crds ./crds/ --namespace default
Step 2: Install Platform
helm dep build ./platform/
helm install dynamo-platform ./platform/ \
--namespace ${NAMESPACE} \
--set "dynamo-operator.controllerManager.manager.image.repository=${DOCKER_SERVER}/dynamo-operator" \
--set "dynamo-operator.controllerManager.manager.image.tag=${IMAGE_TAG}" \
--set "dynamo-operator.imagePullSecrets[0].name=docker-imagepullsecret"
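One detail worth checking before the `--set` above: `DOCKER_SERVER` as exported in the quick-deploy step ends with a `/`, and depending on how the chart joins the values, `${DOCKER_SERVER}/dynamo-operator` could yield a `//` in the repository reference. A small sketch that normalizes it first:

```shell
# Strip a single trailing slash so the joined repository reference has no "//".
DOCKER_SERVER=nvcr.io/nvidia/ai-dynamo/
DOCKER_SERVER=${DOCKER_SERVER%/}
echo "${DOCKER_SERVER}/dynamo-operator"
```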
Verify Installation#
# Check CRDs
kubectl get crd | grep dynamo
# Check operator and platform pods
kubectl get pods -n ${NAMESPACE}
# Expected: dynamo-operator-* and etcd-* pods Running
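The "Expected" check above can be scripted. Here is a tiny filter, a sketch only: `all_running` is a hypothetical helper that succeeds when every pod in the listing reports Running or Completed (column 3 of default `kubectl get pods` output).

```shell
# Succeed only when every non-header line has STATUS Running or Completed.
all_running() {
  awk 'NR > 1 && $3 != "Running" && $3 != "Completed" { bad = 1 } END { exit bad }'
}

# Usage against the platform namespace:
# kubectl get pods -n ${NAMESPACE} | all_running && echo "platform ready"
```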
Next Steps#
Deploy Model/Workflow
# Example: Deploy a vLLM workflow with Qwen3-0.6B using aggregated serving
kubectl apply -f components/backends/vllm/deploy/agg.yaml -n ${NAMESPACE}
# Port forward and test
kubectl port-forward svc/agg-vllm-frontend 8000:8000 -n ${NAMESPACE}
curl http://localhost:8000/v1/models
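Once `/v1/models` responds, the frontend speaks an OpenAI-compatible API, so a chat completion can be sent through the same port-forward. The model name below is an assumption based on the Qwen3-0.6B example; adjust it to whatever `/v1/models` actually reports.

```shell
# Write a request body, then POST it to the OpenAI-compatible endpoint.
cat > /tmp/chat.json <<'EOF'
{
  "model": "Qwen/Qwen3-0.6B",
  "messages": [{"role": "user", "content": "Say hello"}],
  "max_tokens": 32
}
EOF
curl -s http://localhost:8000/v1/chat/completions \
  -H 'Content-Type: application/json' \
  -d @/tmp/chat.json || echo "frontend not reachable yet"
```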
Explore Backend Guides
vLLM Deployments
SGLang Deployments
TensorRT-LLM Deployments
Optional:
SLA Planner Deployment Guide (for advanced SLA-aware scheduling and autoscaling)
Troubleshooting#
Pods not starting?
kubectl describe pod <pod-name> -n ${NAMESPACE}
kubectl logs <pod-name> -n ${NAMESPACE}
Need access to gated HuggingFace models?
kubectl create secret generic hf-token-secret \
--from-literal=HF_TOKEN=${HF_TOKEN} \
-n ${NAMESPACE}
Clean uninstall?
./uninstall.sh # Removes all CRDs and platform