Quickstart
Get a model running on Kubernetes in minutes.
Prerequisites
- Kubernetes cluster (v1.24+) with GPU nodes
- kubectl (v1.24+)
- Helm (v3.0+) installed
- NVIDIA GPU Operator installed on the cluster
- HuggingFace token secret on cluster
HuggingFace token secret
Create a HuggingFace token secret for model downloads. If you don’t have a token, see the HuggingFace token guide.
GPU Operator quick install
If you don’t have the GPU Operator yet:
If your cluster already provides GPU drivers (e.g., GKE with gpu-driver-version=latest, or AKS), add:
Detailed installation
The GPU Operator is the only prerequisite for a basic deployment. For additional features like RDMA, Prometheus, or multinode scheduling with Grove/KAI Scheduler, see the Installation Guide.
If your GPU SKU and cloud provider are supported, you can use AICR for rapid installation of prerequisites and the Dynamo Helm chart.
Verify cluster is ready
Optionally, verify your cluster is ready:
Install Dynamo
Wait for the platform pods:
Deploy Your First Model
Deploy Qwen/Qwen3-0.6B using a DynamoGraphDeploymentRequest (DGDR).
The DGDR is the entrypoint for deploying models. It runs automatic profiling for your model/hardware and creates an auto-configured DynamoGraphDeployment (DGD). After that, the DGDR is completed and reaches a terminal state, similar to a K8s Job and can be cleaned up. The DGD is the resource that persists and serves your model.
Watch the DGDR progress from Pending → Profiling → Deploying → Deployed:
Dynamo supports vLLM, TensorRT-LLM, and SGLang backends. Setting backend: auto lets the profiler choose the best one for your model and hardware. See the backends guide for details.
Send a Request
Once the DGDR shows Deployed:
Cleanup
Next Steps
- Installation Guide — Cloud provider setup, GPU Operator details, optional components (Grove, RDMA, model caching, Prometheus)
- Model Deployment Guide — Strategy selection, model caching, planner, multinode, common pitfalls
- DGDR Reference — Spec reference, lifecycle phases, monitoring commands, DGDR vs DGD
- Creating Deployments — Hand-craft a DGD spec for full control