Dynamo Kubernetes Platform#
Use the Dynamo Kubernetes Platform to deploy and manage Dynamo inference graphs on Kubernetes with automated orchestration and scaling.
Quick Start Paths#
Path A (Production Install): install from published artifacts on your existing cluster → Jump to Path A
Path B (Local Development): set up Minikube first (→ Minikube Setup), then follow Path A
Path C (Custom Development): build from source for customization → Jump to Path C
Prerequisites#
# Required tools
kubectl version --client # v1.24+
helm version # v3.0+
docker version # Running daemon
# Set your inference runtime image
export DYNAMO_IMAGE=nvcr.io/nvidia/ai-dynamo/vllm-runtime:0.4.1
# Also available: sglang-runtime, tensorrtllm-runtime
Tip
No cluster? See Minikube Setup for local development.
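The version requirements above can be checked mechanically. Below is a minimal sketch of a pre-flight helper; `version_ge` is a hypothetical name, and it assumes a `sort` that supports `-V` (GNU coreutils and recent BSD variants).

```shell
# version_ge A B succeeds when version A >= version B.
version_ge() {
  [ "$(printf '%s\n%s\n' "$2" "$1" | sort -V | head -n1)" = "$2" ]
}

# Example: require kubectl client v1.24+ (extract the client version however you prefer).
version_ge 1.29.4 1.24 && echo "kubectl version OK"
```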
Path A: Production Install#
Install from NGC published artifacts in 3 steps.
# 1. Set environment
export NAMESPACE=dynamo-kubernetes
export RELEASE_VERSION=0.4.1 # any version of Dynamo 0.3.2+
# 2. Install CRDs
helm fetch https://helm.ngc.nvidia.com/nvidia/ai-dynamo/charts/dynamo-crds-${RELEASE_VERSION}.tgz
helm install dynamo-crds dynamo-crds-${RELEASE_VERSION}.tgz --namespace default
# 3. Install Platform
kubectl create namespace ${NAMESPACE}
helm fetch https://helm.ngc.nvidia.com/nvidia/ai-dynamo/charts/dynamo-platform-${RELEASE_VERSION}.tgz
helm install dynamo-platform dynamo-platform-${RELEASE_VERSION}.tgz --namespace ${NAMESPACE}
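The two chart URLs above follow a single pattern, so pinning a different release only means changing `RELEASE_VERSION`. A sketch of deriving both URLs (`NGC_CHARTS` is just a local variable used here, not something the charts require):

```shell
# Derive the CRDs and platform chart URLs from one release version.
NGC_CHARTS="https://helm.ngc.nvidia.com/nvidia/ai-dynamo/charts"
RELEASE_VERSION=0.4.1
for chart in dynamo-crds dynamo-platform; do
  echo "${NGC_CHARTS}/${chart}-${RELEASE_VERSION}.tgz"
done
```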
Path C: Custom Development#
Build and deploy from source for customization.
Quick Deploy Script#
# 1. Set environment
export NAMESPACE=dynamo-cloud
export DOCKER_SERVER=nvcr.io/nvidia/ai-dynamo/ # or your registry
export DOCKER_USERNAME='$oauthtoken'
export DOCKER_PASSWORD=<YOUR_NGC_CLI_API_KEY>
export IMAGE_TAG=0.4.1
# 2. Build operator
cd deploy/cloud/operator
earthly --push +docker --DOCKER_SERVER=$DOCKER_SERVER --IMAGE_TAG=$IMAGE_TAG
cd -
# 3. Create namespace and secrets
kubectl create namespace ${NAMESPACE}
kubectl create secret docker-registry docker-imagepullsecret \
--docker-server=${DOCKER_SERVER} \
--docker-username=${DOCKER_USERNAME} \
--docker-password=${DOCKER_PASSWORD} \
--namespace=${NAMESPACE}
# 4. Deploy
helm repo add bitnami https://charts.bitnami.com/bitnami
./deploy.sh --crds
Manual Steps (Alternative)#
Step 1: Install CRDs
helm install dynamo-crds ./crds/ --namespace default
Step 2: Install Platform
helm dep build ./platform/
helm install dynamo-platform ./platform/ \
--namespace ${NAMESPACE} \
--set "dynamo-operator.controllerManager.manager.image.repository=${DOCKER_SERVER}/dynamo-operator" \
--set "dynamo-operator.controllerManager.manager.image.tag=${IMAGE_TAG}" \
--set "dynamo-operator.imagePullSecrets[0].name=docker-imagepullsecret"
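One detail worth checking before the `--set` above: `DOCKER_SERVER` as exported in the quick-deploy step ends with a `/`, and depending on how the chart joins the values, `${DOCKER_SERVER}/dynamo-operator` could yield a `//` in the repository reference. A small sketch that normalizes it first:

```shell
# Strip a single trailing slash so the joined repository reference has no "//".
DOCKER_SERVER=nvcr.io/nvidia/ai-dynamo/
DOCKER_SERVER=${DOCKER_SERVER%/}
echo "${DOCKER_SERVER}/dynamo-operator"
```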
Verify Installation#
# Check CRDs
kubectl get crd | grep dynamo
# Check operator and platform pods
kubectl get pods -n ${NAMESPACE}
# Expected: dynamo-operator-* and etcd-* pods Running
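The "Expected" check above can be scripted. Here is a tiny filter, a sketch only: `all_running` is a hypothetical helper that succeeds when every pod in the listing reports Running or Completed (column 3 of default `kubectl get pods` output).

```shell
# Succeed only when every non-header line has STATUS Running or Completed.
all_running() {
  awk 'NR > 1 && $3 != "Running" && $3 != "Completed" { bad = 1 } END { exit bad }'
}

# Usage against the platform namespace:
# kubectl get pods -n ${NAMESPACE} | all_running && echo "platform ready"
```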
Next Steps#
Deploy Model/Workflow
# Example: Deploy a vLLM workflow with Qwen3-0.6B using aggregated serving
kubectl apply -f components/backends/vllm/deploy/agg.yaml -n ${NAMESPACE}
# Port forward and test
kubectl port-forward svc/agg-vllm-frontend 8000:8000 -n ${NAMESPACE}
curl http://localhost:8000/v1/models
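Once `/v1/models` responds, the frontend speaks an OpenAI-compatible API, so a chat completion can be sent through the same port-forward. The model name below is an assumption based on the Qwen3-0.6B example; adjust it to whatever `/v1/models` actually reports.

```shell
# Write a request body, then POST it to the OpenAI-compatible endpoint.
cat > /tmp/chat.json <<'EOF'
{
  "model": "Qwen/Qwen3-0.6B",
  "messages": [{"role": "user", "content": "Say hello"}],
  "max_tokens": 32
}
EOF
curl -s http://localhost:8000/v1/chat/completions \
  -H 'Content-Type: application/json' \
  -d @/tmp/chat.json || echo "frontend not reachable yet"
```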
Explore Backend Guides
vLLM Deployments
SGLang Deployments
TensorRT-LLM Deployments
Optional:
SLA Planner Deployment Guide (for advanced SLA-aware scheduling and autoscaling)
Troubleshooting#
Pods not starting?
kubectl describe pod <pod-name> -n ${NAMESPACE}
kubectl logs <pod-name> -n ${NAMESPACE}
Need access to gated HuggingFace models?
kubectl create secret generic hf-token-secret \
--from-literal=HF_TOKEN=${HF_TOKEN} \
-n ${NAMESPACE}
Clean uninstall?
./uninstall.sh # Removes all CRDs and platform