Developing the Operator with Tilt
Fast, live-reload development loop for the Dynamo Kubernetes operator
Tilt provides a live-reload development environment for the Dynamo Kubernetes operator. Instead of manually building images, pushing to a registry, and redeploying on every change, Tilt watches your source files and automatically recompiles the Go binary, syncs it into the running container, and restarts the process — all in seconds.
Under the hood, the Tiltfile:
- Compiles the operator binary with go build (CGO_ENABLED=0).
- Renders the Helm chart (deploy/helm/charts/platform) with helm template, applies CRDs via kubectl, and deploys all rendered resources.
This gives you a fully working cluster where you can apply DynamoGraphDeployment and DynamoGraphDeploymentRequest resources and have them reconcile into real workloads, while iterating on controller logic with feedback in seconds.
You also need a container registry that is accessible to your cluster’s nodes, so they can pull the operator image Tilt builds. If you use a local cluster like kind with a local registry, Tilt can push there directly.
Tilt opens a terminal UI and a web dashboard at http://localhost:10350. The dashboard shows resource status, build logs, and port-forwards.
Press Space in the terminal to open the web UI. Press Ctrl-C to stop Tilt; deployed resources stay in the cluster until you run tilt down.

All configuration is optional. The Tiltfile defines sensible defaults for every
setting, and tilt-settings.yaml is gitignored so your personal values
(cluster context, registry, etc.) never leak into the repo.
Create deploy/operator/tilt-settings.yaml with any of the settings below:
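As a minimal sketch, the file can be created from the shell as shown here. The key names registry and skip_codegen come from this guide; the registry value is a placeholder, and the commands assume you are in the repository root. Note that cat > overwrites any existing settings file.

```shell
# Create a personal settings file (gitignored). Run from the repository root.
mkdir -p deploy/operator
cat > deploy/operator/tilt-settings.yaml <<'EOF'
# Every key is optional; omit a key to keep the Tiltfile default.
registry: docker.io/myuser   # placeholder: a registry your cluster nodes can pull from
skip_codegen: true           # skip make generate && make manifests
EOF
```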
The operator image needs to be pullable by your cluster’s nodes. The registry is resolved in priority order:
1. REGISTRY env var, for example REGISTRY=docker.io/myuser tilt up
2. registry in tilt-settings.yaml
The image is pushed as {registry}/controller:tilt-dev.
If no registry is configured, the image is only available locally. This works with kind using a local registry but will fail on remote clusters.
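The priority order above can be sketched as a small shell function. This illustrates the lookup only and is not the Tiltfile's actual code: resolve_image and its argument are hypothetical names, and the bare controller:tilt-dev name printed when nothing is configured is an assumption.

```shell
# resolve_image [SETTINGS_REGISTRY]
# Prints the image reference the operator image would be pushed as.
resolve_image() {
  settings_registry="${1:-}"            # value of `registry` from tilt-settings.yaml, if any
  if [ -n "${REGISTRY:-}" ]; then
    registry="$REGISTRY"                # 1. the REGISTRY env var wins
  else
    registry="$settings_registry"       # 2. fall back to tilt-settings.yaml
  fi
  if [ -n "$registry" ]; then
    printf '%s/controller:tilt-dev\n' "$registry"
  else
    printf 'controller:tilt-dev\n'      # assumption: local-only image name
  fi
}
```

With REGISTRY unset, resolve_image localhost:5001 prints localhost:5001/controller:tilt-dev; with REGISTRY=docker.io/myuser exported, the same call prints docker.io/myuser/controller:tilt-dev.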
When you run tilt up, the following resources are created in order:
1. manager-build: Runs CGO_ENABLED=0 GOOS=linux GOARCH=amd64 go build to compile the operator binary. Re-runs on changes to api/, cmd/, internal/, go.mod, or go.sum.
2. crds: Applies CRDs from the Helm chart via kubectl apply --server-side. When skip_codegen is false, runs make generate && make manifests first.
3. operator: The operator Deployment itself. Tilt watches the compiled binary and uses live_update to sync it into the running container and restart the process, so no image rebuild is needed. On startup, the operator’s built-in cert controller generates a self-signed TLS certificate, injects the CA bundle into webhook configurations, and creates the MPI SSH secret, matching production behavior exactly.
The operator handles webhook certificate generation, CA bundle injection, and MPI SSH key provisioning at runtime, so no external setup is needed.
The inner development loop looks like this:
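1. Edit a Go source file under deploy/operator/.
2. Tilt detects the change and re-runs manager-build to recompile the binary.
3. Tilt syncs the new binary into the running container and restarts the process via live_update.
No docker build, no docker push, no kubectl rollout restart.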
The operator handles webhook TLS certificates automatically at runtime using a built-in cert controller (based on OPA cert-controller). On startup it:
1. Generates a self-signed TLS certificate and stores it in the webhook-server-cert Secret.
2. Injects the CA bundle into the ValidatingWebhookConfiguration and MutatingWebhookConfiguration resources.
This matches production behavior and requires no external tooling. For alternative certificate management (cert-manager or external certs), see the webhook documentation and configure it via helm_values in tilt-settings.yaml.
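The most common workflow is modifying reconciliation logic and wanting fast feedback. As long as you have not changed API types, you can set skip_codegen: true in tilt-settings.yaml so that Tilt skips code generation.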
When you modify files under api/, you need codegen to run. Set skip_codegen: false in tilt-settings.yaml; Tilt will then run make generate && make manifests and re-apply CRDs whenever api/ files change.
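Switching between these two workflows means flipping skip_codegen in the settings file. One way to do that from the shell is sketched below. set_tilt_setting is a hypothetical helper, not part of the repository; it only handles single-line key: value entries, and it assumes you run it from the repository root.

```shell
# Hypothetical helper: set a single-line `key: value` entry in tilt-settings.yaml,
# replacing an existing entry for the same key if there is one.
set_tilt_setting() {
  key="$1"
  value="$2"
  file="deploy/operator/tilt-settings.yaml"
  mkdir -p "$(dirname "$file")"
  touch "$file"
  # Copy every line except an existing entry for this key, then append the new one.
  grep -v "^${key}:" "$file" > "${file}.tmp" || true
  printf '%s: %s\n' "$key" "$value" >> "${file}.tmp"
  mv "${file}.tmp" "$file"
}

# Turn codegen on before editing files under api/.
set_tilt_setting skip_codegen false
```

Calling set_tilt_setting skip_codegen true afterwards switches back to the faster controller-only loop.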
Enable the necessary subcharts through helm_values in tilt-settings.yaml.
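You can override the registry without editing the settings file by setting the REGISTRY environment variable for a single run, for example REGISTRY=docker.io/myuser tilt up.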
The web UI at http://localhost:10350 shows:
- Resource status
- Build logs
- Port-forwards, including localhost:8081
Resources are grouped by label (operator and infrastructure) to keep the UI organized.
If pods show ImagePullBackOff:
- Check that registry is set in tilt-settings.yaml or via the REGISTRY env var.
If applying a DynamoGraphDeployment (DGD) or DynamoGraphDeploymentRequest (DGDR) fails with x509: certificate signed by unknown authority:
- Verify that the webhook-server-cert Secret exists and has been populated.
- Look for the cert-controller log messages in the operator logs before applying resources.
If crds fails with codegen errors:
- Make sure controller-gen is installed: make controller-gen
- Run make generate && make manifests manually to see the full error.
- Set skip_codegen: true temporarily to bypass codegen if you haven’t changed API types.
If Tilt refuses to start with a context error, add your cluster context to allowed_contexts in tilt-settings.yaml.
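A sketch of the context fix, assuming allowed_contexts takes a list of context names and is not already present in your settings file. The context name kind-dynamo-dev is a placeholder; substitute the name printed by kubectl config current-context.

```shell
# Append an allowed_contexts entry to the settings file (run from the repository root).
# Assumes the key is not already present; edit the existing list by hand if it is.
mkdir -p deploy/operator
cat >> deploy/operator/tilt-settings.yaml <<'EOF'
allowed_contexts:
  - kind-dynamo-dev   # placeholder: your kubectl context name
EOF
```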