Deploy Boltz2 with Helm#
Boltz2 NIMs are designed to run on systems with NVIDIA GPUs. To deploy using Helm, you must have a Kubernetes cluster with appropriate GPU nodes and the GPU Operator installed.
Prerequisites#
If you have not set up your NGC API key, refer to Getting Started first.
Required Components#
Configured Kubernetes cluster with GPU nodes
NVIDIA GPU Operator installed
Helm 3.x or later
kubectl configured to access your cluster
NGC API key with access to Boltz2 NIM
Download the Helm Chart#
The Boltz2 NIM Helm chart is available from the NGC Catalog. Use the following command to download the chart:
helm fetch https://helm.ngc.nvidia.com/nim/mit/charts/boltz2-nim-<version>.tgz --username='$oauthtoken' --password=<YOUR API KEY>
Replace <version> with the desired chart version. In most cases, you should use the latest version available at:
https://catalog.ngc.nvidia.com/orgs/nim/teams/mit/helm-charts/boltz2-nim/files
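If you prefer not to edit the command by hand each time, you can capture the chosen version in a shell variable. CHART_VERSION is just a convenience name (not part of the chart), and NGC_API_KEY is assumed to hold your API key, as in the later steps:

# Convenience variables (substitute the chart version chosen from the catalog page above)
export CHART_VERSION=<version>
export NGC_API_KEY=<YOUR API KEY>

helm fetch https://helm.ngc.nvidia.com/nim/mit/charts/boltz2-nim-${CHART_VERSION}.tgz \
  --username='$oauthtoken' \
  --password=$NGC_API_KEY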
Configure Helm#
The following Helm options are the most important to configure for deploying Boltz2 NIM:
Core Configuration Options#
boltz2.repository – The Boltz2 NIM container to deploy
boltz2.tag – The version of the container
boltz2.servicePort – The port the Kubernetes service listens on (default: 8081)
boltz2.structureOptimizedBackend – Backend for structure prediction: “trt” (default) or “pytorch”
boltz2.affinityOptimizedBackend – Backend for affinity binding: “trt” (default) or “pytorch”
boltz2.enableDiffusionTF32 – Enable TensorFloat-32: “0” or “1” (default: “1”)
Storage options – Based on your cluster environment (see Storage section)
ngc.apiSecret and imagePullSecrets – For NGC authentication
resources – GPU and memory requirements
To view all available configuration options, run:
helm show readme boltz2-nim-<version>.tgz | less
To examine default values:
helm show values boltz2-nim-<version>.tgz
Quick Start Deployment#
1. Create NGC Secrets#
Create the necessary Kubernetes secrets for NGC authentication:
# Image pull secret
kubectl create secret docker-registry ngc-secret \
  --docker-server=nvcr.io \
  --docker-username='$oauthtoken' \
  --docker-password=$NGC_API_KEY

# NGC API key secret
kubectl create secret generic ngc-api \
  --from-literal=NGC_API_KEY=$NGC_API_KEY
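To confirm both secrets exist before installing the chart, you can list them:

kubectl get secret ngc-secret ngc-api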
2. Create Values File#
Create a file named custom-values.yaml with your configuration:
boltz2:
  repository: "nvcr.io/nim/mit/boltz2"
  tag: "1.5.0"
  structureOptimizedBackend: "trt"
  affinityOptimizedBackend: "trt"
  enableDiffusionTF32: "1"

ngc:
  apiSecret: ngc-api

persistence:
  enabled: true
  size: 100Gi

resources:
  limits:
    nvidia.com/gpu: 1
    memory: 64Gi
  requests:
    memory: 32Gi

imagePullSecrets:
  - name: ngc-secret
3. Install the Chart#
Deploy Boltz2 NIM using Helm:
export CHART_NAME=boltz2-nim
helm install "${CHART_NAME}" boltz2-nim-<version>.tgz -f custom-values.yaml
4. Verify Deployment#
Check the pod status:
kubectl get pods
View logs to monitor startup progress:
kubectl logs -f <pod-name>
Note
Due to large model files, pods may take several minutes to start. The NIM will download model weights on first startup if they are not already present in the persistent volume.
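Rather than polling manually, you can block until the pod reports Ready. This assumes the chart labels its pods with the standard app.kubernetes.io/instance label (the same label used in the Uninstall section):

kubectl wait pod \
  -l app.kubernetes.io/instance="${CHART_NAME}" \
  --for=condition=Ready \
  --timeout=30m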
Storage#
Proper storage configuration is critical for Boltz2 NIM deployment. The model cache can be large, and you have several options:
Persistent Volume Claims (Recommended)#
Persistent storage is enabled by setting persistence.enabled: true. The chart supports the following persistence options:
| Field | Default | Description |
|---|---|---|
| persistence.enabled | true | Enable persistent storage for the model cache |
| persistence.existingClaim | "" | Use an existing PVC by name; if set, no new PVC is created |
| persistence.storageClass | standard | StorageClass for dynamic provisioning; use "" for the cluster default |
| persistence.accessMode | ReadWriteOnce | Access mode for the volume |
| persistence.hostPath | /data/nim | Host path when using hostPath (development) |
| persistence.size | 10Gi | Size of the volume; increase for large model caches |
| persistence.fixPermissions | true | Adjust volume permissions on startup |
Example using chart defaults:
persistence:
  enabled: true
  existingClaim: ""
  storageClass: standard
  accessMode: ReadWriteOnce
  hostPath: /data/nim
  size: 10Gi
  fixPermissions: true
To use the cluster default StorageClass, set storageClass: "". For production with a large model cache, increase size (e.g. 100Gi):
persistence:
  enabled: true
  storageClass: ""  # Use default storage class
  size: 100Gi
For multi-pod scenarios, use a storage class that supports ReadWriteMany:
persistence:
  enabled: true
  storageClass: "nfs-client"  # Example: NFS storage class
  accessMode: ReadWriteMany
  size: 100Gi
Alternative Storage Options#
NFS Storage#
The chart does not support direct NFS volume mounts. For NFS-backed storage, use a StorageClass that provisions NFS volumes and enable persistence:
persistence:
  enabled: true
  storageClass: "nfs-client"  # Or your cluster's NFS provisioner
  accessMode: ReadWriteMany
  size: 100Gi
HostPath (development)#
The chart’s PV template uses persistence.hostPath as the node path for the PersistentVolume (default: /data/nim). To use a different path on the node:
persistence:
  enabled: true
  hostPath: /data/nim  # or e.g. /model-store; must exist on the node
  storageClass: standard
  size: 10Gi
Warning
HostPath ties the volume to a specific node and has security implications. Prefer a StorageClass for multi-node or production use.
GPU Configuration#
Single GPU Deployment#
Most Boltz2 deployments use a single GPU:
resources:
  limits:
    nvidia.com/gpu: 1
GPU Selection#
Use node selectors to target specific GPU types:
nodeSelector:
  nvidia.com/gpu.product: "NVIDIA-H100-80GB-HBM3"
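To find the exact product label values available in your cluster (the nvidia.com/gpu.product label is applied by the GPU Operator's node feature discovery), list the nodes with that label shown as a column:

kubectl get nodes -L nvidia.com/gpu.product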
Or use tolerations for GPU taints:
tolerations:
  - key: nvidia.com/gpu
    operator: Exists
    effect: NoSchedule
Service Configuration#
Basic Service#
The default service configuration exposes the Boltz2 API. The chart provides:
service.type – Service type (default: ClusterIP). This is the only option under the service key.
boltz2.servicePort – The port the service listens on (default: 8081). To use a different port, set this under the boltz2 section of your values.
Service name – Generated from the Helm release name by the chart’s templates and cannot be overridden.
Example values to change the service port:
service:
  type: ClusterIP

boltz2:
  servicePort: 9090  # Default is 8081
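After installing, you can verify that the service exists and exposes the expected port. The service name is derived from the Helm release name, as noted above:

kubectl get service "${CHART_NAME}"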
Port Forwarding (Testing)#
For local testing, use port forwarding. Use the same port as boltz2.servicePort (default 8081) on the right side of the mapping:
kubectl port-forward service/"${CHART_NAME}" 8080:8081
If you set boltz2.servicePort to a different value (e.g. 9090), use that port instead:
kubectl port-forward service/"${CHART_NAME}" 8080:9090
Test the API:
curl http://localhost:8080/v1/health/live
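You can also check the readiness endpoint, which is the same path used by the chart's readiness probe (see Custom Probes):

curl http://localhost:8080/v1/health/ready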
Ingress Configuration#
For production deployments, configure ingress:
ingress:
  enabled: true
  className: nginx
  annotations:
    cert-manager.io/cluster-issuer: letsencrypt-prod
  hosts:
    - host: boltz2.example.com
      paths:
        - path: /
          pathType: Prefix
  tls:
    - secretName: boltz2-tls
      hosts:
        - boltz2.example.com
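With this ingress in place and DNS pointing at your ingress controller, the API is reachable through the configured host (boltz2.example.com is the example host from the values above):

curl https://boltz2.example.com/v1/health/live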
Advanced Configuration#
Autoscaling#
The chart supports horizontal pod autoscaling based on CPU utilization (GPU-based autoscaling is not supported):
autoscaling:
  enabled: true
  minReplicas: 1
  maxReplicas: 100
  targetCPUUtilizationPercentage: 80
If your chart version supports it, you can also set targetMemoryUtilizationPercentage for memory-based scaling.
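After enabling autoscaling, you can confirm that the HorizontalPodAutoscaler was created and watch its current utilization (this requires a metrics server in the cluster):

kubectl get hpa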
Custom Probes#
The chart configures liveness and readiness probes under probes.liveness and probes.readiness. The probe paths are fixed in the template (/v1/health/live and /v1/health/ready); there are no startupProbe, enabled, or path fields. Adjust the timings if large models or slow storage delay startup:
probes:
  liveness:
    initialDelaySeconds: 300
    periodSeconds: 10
    timeoutSeconds: 1
    successThreshold: 1
    failureThreshold: 3
  readiness:
    initialDelaySeconds: 300
    periodSeconds: 10
    timeoutSeconds: 1
    successThreshold: 1
    failureThreshold: 3
Increase initialDelaySeconds or failureThreshold if pods need more time to become ready.
Security Context#
Configure pod and container security (chart defaults):
podSecurityContext:
  runAsUser: 1000
  runAsGroup: 1000
  fsGroup: 1000

securityContext:
  runAsUser: 1000
  runAsGroup: 1000
  allowPrivilegeEscalation: false
The chart uses securityContext for the container; there is no containerSecurityContext or capabilities in the default values.
Run Inference#
Once deployed, you can run predictions using the API:
curl -X POST http://localhost:8080/biology/mit/boltz2/predict \
  -H "Content-Type: application/json" \
  -d '{
    "polymers": [
      {
        "molecule_type": "protein",
        "sequence": "MKTAYIAKQRQISFVKSHFSRQLEERLGLIEVQAPILSRVGDGTQDNLSGAEKAVQVKVKALPDAQFEVVHSLAKWKRQTLGQHDFSAGEGLYTHMKALRPDEDRLSPLHSVYVDQWDWERVMGDGERQFSTLKSTVEAIWAGIKATEAAVSEEFGLAPFLPDQIHFVHSQELLSRYPDLDAKGRERAIAKDLGAVFLVGIGGKLSDGHRHDVRAPDYDDWSTPSELGHAGLNGDILVWNPVLEDAFELSSMGIRVDADTLKHQLALTGDEDRLELEWHQALLRGEMPQTIGGGIGQSRLTMLLLQLPHIGQVQAGVWPAAVRESVPSLL"
      }
    ],
    "diffusion_samples": 1,
    "recycling_steps": 3
  }'
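For longer sequences it is often easier to keep the request body in a file and pass it to curl with the @ syntax. Here request.json is just an example filename containing the same JSON payload as above:

curl -X POST http://localhost:8080/biology/mit/boltz2/predict \
  -H "Content-Type: application/json" \
  -d @request.json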
For detailed API documentation, refer to Inference.
Troubleshooting#
Pod Stuck in Pending#
Check events:
kubectl describe pod <pod-name>
Common issues:
Insufficient GPU resources (see the node check below)
Node taints requiring tolerations
Storage mount failures
Image pull errors
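If the scheduler reports insufficient GPU resources, check that your nodes actually advertise allocatable GPUs. Replace <node-name> with one of your GPU nodes:

kubectl describe node <node-name> | grep -i nvidia.com/gpu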
Pod Fails to Start#
View logs:
kubectl logs <pod-name>
If the pod needs more time to become ready, increase probe delays or failure thresholds under probes (the chart does not support a startup probe):
probes:
  liveness:
    initialDelaySeconds: 600  # Allow more time before first check
    failureThreshold: 10
  readiness:
    initialDelaySeconds: 600
    failureThreshold: 10
Storage Issues#
Verify PVC status:
kubectl get pvc
kubectl describe pvc <pvc-name>
Ensure storage class supports the required access mode.
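To see which StorageClasses are available in the cluster (and which one is the default), list them:

kubectl get storageclass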
GPU Access#
Test GPU availability:
kubectl run gpu-test \
  --image=nvidia/cuda:12.6.2-base-ubuntu22.04 \
  --restart=Never \
  --command -- nvidia-smi
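Once the test pod has completed, check its output and clean it up:

kubectl logs gpu-test
kubectl delete pod gpu-test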
Uninstall#
To remove the Boltz2 NIM deployment:
helm uninstall "${CHART_NAME}"
To also remove persistent volumes:
kubectl delete pvc -l app.kubernetes.io/instance="${CHART_NAME}"