Dynamo (Experimental)#
Warning
This feature is experimental and is not fully supported. It is included here as a preview for testing environments and is not recommended for production use cases. There may be changes to functionality, implementation, and APIs in future releases.
Dynamo is NVIDIA’s AI inference platform that provides an OpenAI-compatible frontend for running large language models (LLMs) with multiple backend options. It’s designed to simplify the deployment and management of AI models for inference workloads. Refer to the Dynamo documentation for more details.
Using the NIM Operator, you can deploy Dynamo Custom Resource Definitions (CRDs) when the dynamo.enabled=true flag is set in the NIM Operator Helm chart.
Note
Currently, the feature is supported on any upstream-compatible Kubernetes deployment. OpenShift support is planned for a future release.
Steps to deploy with Dynamo:
Prerequisites#
The following are prerequisites for running Dynamo sample manifests.
Create a target namespace for Dynamo deployments, such as dynamo-examples.

$ kubectl create namespace dynamo-examples
Create an image pull secret and a NIM authentication secret in your target namespace using your NGC API token. The examples below use dynamo-examples. Refer to NGC Setup for more details on obtaining your NGC API key.

Create the image pull secret:

$ kubectl create secret docker-registry ngc-secret \
    --docker-server=nvcr.io \
    --docker-username='$oauthtoken' \
    --docker-password="<your-NGC-API-key>" \
    -n dynamo-examples
Create the NIM authentication secret:

$ kubectl create secret generic ngc-api-secret \
    --from-literal=NGC_API_KEY="<your-NGC-API-key>" \
    -n dynamo-examples
Create a Hugging Face token secret. Generate a Read token at Hugging Face Settings, then create the secret in your target namespace. The example below uses dynamo-examples.

$ kubectl create secret generic hf-token-secret \
    --from-literal=HF_TOKEN="<your-huggingface-token>" \
    -n dynamo-examples
Configure NIM Operator with Dynamo Enabled#
The following steps install or update the NIM Operator with Dynamo enabled. The NIM Operator configures the Dynamo CRDs and deploys the Dynamo management pods on your cluster.
Note
The NIM Operator will deploy Dynamo CRDs v0.9.0.
Add the NGC Helm repository.

$ helm repo add nvidia https://helm.ngc.nvidia.com/nvidia
$ helm repo update
Install the Operator.

$ helm upgrade --install nim-operator \
    nvidia/k8s-nim-operator -n nim-operator --create-namespace \
    --version=3.1.0 --set dynamo.enabled=true
The NIM Operator supports configuring Dynamo values including etcd, nats, dynamo-operator, grove, and kai-scheduler. Refer to the dynamo-platform chart for more details on these configuration settings.

For example, you could enable Dynamo and override grove and kai-scheduler by including --set dynamo.grove.enabled=true and --set dynamo.kai-scheduler.enabled=true in the above Helm command.

Alternatively, you can configure Dynamo using a values file. Use a top-level dynamo: key so the NIM Operator chart passes the values to the Dynamo subchart. The following is an example values.yaml file with dynamo configuration options.

dynamo:
  enabled: true
  grove:
    enabled: true
  kai-scheduler:
    enabled: true
  # Add other dynamo-platform keys as needed
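To use a values file with the install command shown earlier, pass it to Helm with the -f flag instead of individual --set overrides. The sketch below writes the example values.yaml and shows the corresponding install command; the Helm invocation is commented out because it requires Helm and cluster access.

```shell
# Write the example values file (same settings as the inline example above).
cat > values.yaml <<'EOF'
dynamo:
  enabled: true
  grove:
    enabled: true
  kai-scheduler:
    enabled: true
EOF

# Install or upgrade the operator with the values file
# (requires Helm and cluster access, so it is commented out here):
# helm upgrade --install nim-operator nvidia/k8s-nim-operator \
#   -n nim-operator --create-namespace --version=3.1.0 -f values.yaml
```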
Update the nim-operator-dynamo-operator-controller-manager Deployment to use a different kube-rbac-proxy image. The default image repository in the Dynamo v0.9.0 Helm chart has been deprecated.

$ kubectl set image deployment/nim-operator-dynamo-operator-controller-manager \
    kube-rbac-proxy=registry.k8s.io/kubebuilder/kube-rbac-proxy:v0.16.0 \
    -n nim-operator
Verify Dynamo CRDs and NIM Operator#
Optional: You can verify that Dynamo is available on your cluster.
Validate that Dynamo CRDs were installed on your cluster.
$ kubectl get crd | grep -E dynamo
Example output:
dynamocomponentdeployments.nvidia.com             2026-03-11T17:22:11Z
dynamographdeploymentrequests.nvidia.com          2026-03-11T17:22:11Z
dynamographdeployments.nvidia.com                 2026-03-11T17:22:11Z
dynamographdeploymentscalingadapters.nvidia.com   2026-03-11T17:22:11Z
dynamomodels.nvidia.com                           2026-03-11T17:22:11Z
dynamoworkermetadatas.nvidia.com                  2026-03-11T17:22:11Z
Validate that the Dynamo pods are running on your cluster.
$ kubectl get pods -n nim-operator
Example output for a default Dynamo deployment without optional configuration settings:
NAME                                                              READY   STATUS      RESTARTS   AGE
nim-operator-dynamo-operator-controller-manager-6f8575d6986zxbm   2/2     Running     0          24h
nim-operator-dynamo-operator-webhook-ca-inject-1-ttfqd            0/1     Completed   0          24h
nim-operator-dynamo-operator-webhook-cert-gen-1-b9j7c             0/1     Completed   0          24h
nim-operator-etcd-0                                               1/1     Running     0          24h
nim-operator-k8s-nim-operator-54dc697cc7-nbh2b                    1/1     Running     0          24h
nim-operator-nats-0                                               2/2     Running     0          24h
Dynamo vLLM Deployment (Qwen Example)#
The Dynamo repo has several example manifests available. Refer to the Dynamo v0.9.0 example folder for detailed examples. The following example manifest uses Qwen/Qwen3-0.6B and is based on the Dynamo agg.yaml example manifest.
To run a Dynamo example manifest with the NIM Operator, update the manifest so that each service's extraPodSpec.imagePullSecrets list includes ngc-secret, allowing images to be pulled from NGC (nvcr.io).
Steps for creating the ngc-secret and hf-token-secret secrets (both already referenced in the examples) can be found in the Prerequisites section.
Create a spec like the following example, called agg.yaml.

apiVersion: nvidia.com/v1alpha1
kind: DynamoGraphDeployment
metadata:
  name: vllm-agg
  namespace: dynamo-examples
spec:
  services:
    Frontend:
      componentType: frontend
      replicas: 1
      extraPodSpec:
        imagePullSecrets:
          - name: ngc-secret  # ADDED: For NGC image pull
        mainContainer:
          image: nvcr.io/nvidia/ai-dynamo/vllm-runtime:0.9.0
    VllmDecodeWorker:
      envFromSecret: hf-token-secret
      componentType: worker
      replicas: 1
      resources:
        limits:
          gpu: "1"
      extraPodSpec:
        imagePullSecrets:
          - name: ngc-secret  # ADDED: For NGC image pull
        mainContainer:
          image: nvcr.io/nvidia/ai-dynamo/vllm-runtime:0.9.0
          workingDir: /workspace/examples/backends/vllm
          command: ["python3", "-m", "dynamo.vllm"]
          args: ["--model", "Qwen/Qwen3-0.6B"]
Apply the manifest.
$ kubectl apply -f agg.yaml -n dynamo-examples
It may take a few minutes for the images to pull and the pods to start.
Optional: Check that pods are running.
$ kubectl get pods -n dynamo-examples
Testing the Deployment#
Create an example manifest, curl.yaml, like the following example:

apiVersion: v1
kind: Pod
metadata:
  name: curl
  labels:
    app: curl
  namespace: dynamo-examples
spec:
  containers:
  - name: curl
    image: curlimages/curl:7.83.1
    imagePullPolicy: IfNotPresent
    command:
    - tail
    - -f
    - /dev/null
  restartPolicy: Never
Deploy a test pod.
$ kubectl apply -f curl.yaml -n dynamo-examples
Execute an inference request using one of the following methods.
Standard Completion:
$ kubectl exec curl -n dynamo-examples -- curl -i \
    http://vllm-agg-frontend:8000/v1/completions \
    -H 'Content-Type: application/json' \
    -d '{"model": "Qwen/Qwen3-0.6B", "prompt": "Write as if you were a critic: San Francisco", "max_tokens": 100, "temperature": 0}'
Chat Completion:
$ kubectl exec curl -n dynamo-examples -- curl -i \
    http://vllm-agg-frontend:8000/v1/chat/completions \
    -H 'Content-Type: application/json' \
    -d '{"model": "Qwen/Qwen3-0.6B", "messages": [{"role": "user", "content": "Write as if you were a critic: San Francisco"}], "max_tokens": 100, "temperature": 0}'
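Because the Dynamo frontend exposes an OpenAI-compatible API, you can also call it from Python with only the standard library. The sketch below assumes the frontend service has been port-forwarded locally (for example, kubectl port-forward svc/vllm-agg-frontend 8000:8000 -n dynamo-examples); the base URL and the helper function names are illustrative, not part of Dynamo.

```python
import json
import urllib.request

# Assumption: the frontend is reachable locally via port-forward.
BASE_URL = "http://localhost:8000"


def build_chat_payload(model: str, prompt: str, max_tokens: int = 100) -> dict:
    """Build an OpenAI-compatible chat completion request body."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": max_tokens,
        "temperature": 0,
    }


def chat(prompt: str) -> str:
    """Send a chat completion request to the frontend and return the reply text."""
    payload = build_chat_payload("Qwen/Qwen3-0.6B", prompt)
    req = urllib.request.Request(
        f"{BASE_URL}/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        body = json.load(resp)
    # Standard OpenAI-style response shape: first choice, message content.
    return body["choices"][0]["message"]["content"]


# Example (requires the port-forward and a running deployment):
# print(chat("Write as if you were a critic: San Francisco"))
```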