DynamoModel is a Kubernetes Custom Resource that represents a machine learning model deployed on Dynamo. It enables you to:
DynamoModel works alongside DynamoGraphDeployment (DGD) or DynamoComponentDeployment (DCD) resources. While DGD/DCD deploy the inference infrastructure (pods, services), DynamoModel handles model-specific operations like loading LoRA adapters.
Before creating a DynamoModel, you need:
- DynamoGraphDeployment or DynamoComponentDeployment
- modelRef pointing to your base model
For complete setup including DGD configuration, see Integration with DynamoGraphDeployment.
1. Create your DynamoModel:
2. Apply and verify:
Expected output:
That's it! The operator automatically discovers endpoints and loads the LoRA.
For detailed status monitoring, see Monitoring & Operations.
DynamoModel supports three model types:
Most users will use lora to deploy fine-tuned models on top of their base model deployments.
When you create a DynamoModel, the operator:
- Discovers endpoints serving your baseModelName (by matching modelRef.name in DGD/DCD)
- Loads the LoRA on each endpoint (for lora type)
Key linkage:
DynamoModel requires just a few key fields to deploy a model or adapter:
Example minimal LoRA configuration:
For complete field specifications, validation rules, and all options, see: DynamoModel API Reference
The status shows discovered endpoints and their readiness:
Key status fields:
- totalEndpoints / readyEndpoints: Counts of discovered vs ready endpoints
- endpoints[]: List with addresses, pod names, and ready status
- conditions: Standard Kubernetes conditions (EndpointsReady, ServicesFound)
For detailed status usage, see the Monitoring & Operations section below.
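As an illustration of how these status fields can be read, here is a minimal Python sketch that works on a status object already loaded into a dictionary. The top-level names totalEndpoints, readyEndpoints, and endpoints come from the list above. The per-endpoint keys (`address`, `podName`, `ready`) and the sample values are assumptions made for this example, so check the API reference for the actual names.

```python
from typing import Any, Dict, List


def summarize(status: Dict[str, Any]) -> str:
    """Return a short 'ready/total' summary from the status counters."""
    ready = status.get("readyEndpoints", 0)
    total = status.get("totalEndpoints", 0)
    return f"{ready}/{total} endpoints ready"


def not_ready_pods(status: Dict[str, Any]) -> List[str]:
    """Return the pod names of endpoints that are not ready yet."""
    # "podName" and "ready" are assumed key names for entries in endpoints[].
    return [
        ep.get("podName", "<unknown>")
        for ep in status.get("endpoints", [])
        if not ep.get("ready", False)
    ]


# Made-up sample data, for illustration only.
example_status = {
    "totalEndpoints": 2,
    "readyEndpoints": 1,
    "endpoints": [
        {"address": "http://10.0.1.5:9090", "podName": "worker-0", "ready": True},
        {"address": "http://10.0.1.6:9090", "podName": "worker-1", "ready": False},
    ],
}

print(summarize(example_status))
print(not_ready_pods(example_status))
```

Listing the pods that are not ready narrows down where to look first when only part of a deployment has loaded the adapter.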
Deploy a LoRA adapter stored in an S3 bucket.
Prerequisites:
- meta-llama/Llama-3.3-70B-Instruct running via DGD/DCD
Verification:
Deploy a LoRA adapter from HuggingFace Hub.
Prerequisites:
- Qwen/Qwen3-0.6B running via DGD/DCD
With HuggingFace token:
Deploy multiple LoRA adapters on the same base model deployment.
Both LoRAs will be loaded on all pods serving Qwen/Qwen3-0.6B. Your application can then route requests to the appropriate adapter.
Quick status check:
Example output:
Detailed status:
Example output:
An endpoint is ready when:
Condition states:
- EndpointsReady=True: All endpoints are ready (full availability)
- EndpointsReady=False, Reason=NotReady: Not all endpoints ready (check message for counts)
- EndpointsReady=False, Reason=NoEndpoints: No endpoints found
When readyEndpoints < totalEndpoints, the operator automatically retries loading every 30 seconds.
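The mapping from endpoint counts to condition states can be sketched in a few lines of Python. This is an illustrative model of the rules listed above, not the operator's actual code. The function names are hypothetical, and the reason reported when the condition is True is not stated here, so the sketch returns None in that case.

```python
from typing import Optional, Tuple

RETRY_INTERVAL_SECONDS = 30  # retry interval stated in the text above


def endpoints_ready_condition(total: int, ready: int) -> Tuple[str, Optional[str]]:
    """Return (status, reason) for the EndpointsReady condition."""
    if total == 0:
        # Nothing was discovered at all.
        return ("False", "NoEndpoints")
    if ready < total:
        # Some endpoints were discovered but have not loaded the model yet.
        return ("False", "NotReady")
    # Every discovered endpoint is ready.
    return ("True", None)


def needs_retry(total: int, ready: int) -> bool:
    """True when some discovered endpoints are still not ready."""
    return ready < total


print(endpoints_ready_condition(3, 2), needs_retry(3, 2))
```

Checking for zero endpoints before comparing the counts keeps the NoEndpoints case distinct from the "all ready" case, where the two counters are also equal.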
Get endpoint addresses:
Output:
Get endpoint pod names:
Check readiness of each endpoint:
Output:
To update a LoRA (e.g., deploy a new version):
The operator will detect the change and reload the LoRA on all endpoints.
For LoRA models, the operator will:
The base model deployment (DGD/DCD) continues running normally.
Symptom:
Common Causes:
Base model deployment not running
Solution: Deploy your DGD/DCD first, wait for pods to be ready.
baseModelName mismatch
Solution: Ensure baseModelName in DynamoModel exactly matches modelRef.name in DGD.
Pods not ready
Solution: Wait for pods to reach Running and Ready state.
Wrong namespace
Solution: Ensure DynamoModel is in the same namespace as your DGD/DCD.
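The name and namespace checks above can be expressed as a small diagnostic. The Python sketch below uses flattened, hypothetical dictionaries (`baseModelName`, `modelRefName`, `namespace`) rather than the real resource schema, and it assumes the name comparison is an exact, case-sensitive string match.

```python
from typing import Any, Dict, List


def diagnose_no_endpoints(model: Dict[str, Any], deployments: List[Dict[str, Any]]) -> str:
    """Suggest why a DynamoModel matches no base model deployment."""
    base = model["baseModelName"]
    namespace = model["namespace"]

    # Deployments whose modelRef.name equals baseModelName exactly.
    same_name = [d for d in deployments if d["modelRefName"] == base]
    if not same_name:
        return f"no deployment has modelRef.name == {base!r}"
    if not any(d["namespace"] == namespace for d in same_name):
        return f"matching deployment exists, but not in namespace {namespace!r}"
    return "name and namespace match; check that pods are Running and Ready"


# Made-up sample data, for illustration only.
deployments = [{"modelRefName": "Qwen/Qwen3-0.6B", "namespace": "prod"}]
print(diagnose_no_endpoints({"baseModelName": "qwen/qwen3-0.6b", "namespace": "prod"}, deployments))
```

In this example the lowercase name does not match, which is the kind of mismatch that is easy to miss when reading manifests by eye.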
Symptom:
Common Causes:
Source URI not accessible
Solution:
Invalid LoRA format
Solution: Ensure your LoRA weights are in the format expected by your backend framework (SGLang, vLLM, etc.).
Endpoint API errors
Solution: Check the backend framework's logs in the worker pods:
Out of memory
Solution: LoRA adapters require additional memory. Increase memory limits in your DGD:
Symptom: Some endpoints remain not ready for extended periods.
Diagnosis:
Common Causes:
When to wait vs investigate:
Check events:
View operator logs:
Common events and messages:
This section shows the complete end-to-end workflow for deploying base models and LoRA adapters together.
DynamoModel and DynamoGraphDeployment work together to provide complete model deployment:
The connection is established through the modelRef field in your DGD:
Complete example:
Recommended order:
What happens behind the scenes:
The operator automatically handles all service discovery; you don't configure services, labels, or selectors manually.
For complete field specifications, validation rules, and detailed type definitions, see:
DynamoModel provides declarative model management for Dynamo deployments:
- Simple: 2-step deployment of LoRA adapters
- Automatic: Endpoint discovery and loading handled by operator
- Observable: Rich status reporting and conditions
- Integrated: Works seamlessly with DynamoGraphDeployment
Next Steps: