Kubernetes Deployment
Deploy the AICR API Server in your Kubernetes cluster for self-hosted recipe generation.
Overview
API Server deployment enables self-hosted recipe generation:
- Isolated deployment: Recipe data stays within your infrastructure
- Custom recipes: Modify embedded recipe data (see `recipes/`)
- High availability: Deploy multiple replicas with load balancing
- Observability: Prometheus `/metrics` endpoint and structured logging
API Server scope:
- Recipe generation from query parameters (query mode)
- Does not capture snapshots (use agent Job or CLI)
- Generates bundles via `POST /v1/bundle`
- Does not analyze snapshots (query mode only)
Agent deployment (separate component):
- Kubernetes Job captures cluster configuration
- Writes snapshot to ConfigMap via Kubernetes API
- Requires RBAC: ServiceAccount with ConfigMap create/update permissions
- See Agent Deployment
Typical workflow:
- Deploy agent Job → Captures snapshot → Writes to ConfigMap
- CLI reads ConfigMap → Generates recipe → Writes to file or ConfigMap
- CLI reads recipe → Generates bundle → Writes to filesystem
- Apply bundle to cluster (Helm install, kubectl apply)
Quick Start
Helm chart: Not yet available. Use the manual manifests below.
Manual Deployment
1. Create Namespace
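A minimal sketch of this step, assuming the `aicr` namespace that the troubleshooting commands in this guide use. The `render_namespace` helper is an illustrative name, not part of AICR.

```shell
#!/usr/bin/env bash
# Prints a Namespace manifest. "aicr" matches the -n aicr flag used elsewhere
# in this guide; set NAMESPACE to override it.
NAMESPACE="${NAMESPACE:-aicr}"

render_namespace() {
  cat <<EOF
apiVersion: v1
kind: Namespace
metadata:
  name: ${NAMESPACE}
EOF
}

render_namespace
```

Pipe the output to `kubectl apply -f -` to create the namespace.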
2. Create Deployment
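The following is a sketch only, not the project's published manifest. The Deployment name, the `app: aicr-api-server` label, the image reference, and port 8080 are all placeholders chosen for illustration; substitute the values for your AICR release. A TCP readiness probe is used so that no HTTP health path has to be assumed.

```shell
#!/usr/bin/env bash
NAMESPACE="${NAMESPACE:-aicr}"
# Placeholder image: replace with the image your AICR release publishes.
IMAGE="${IMAGE:-example.invalid/aicr-api-server:latest}"
# Assumed listen port; adjust to the port the API server actually binds.
PORT="${PORT:-8080}"

render_deployment() {
  cat <<EOF
apiVersion: apps/v1
kind: Deployment
metadata:
  name: aicr-api-server
  namespace: ${NAMESPACE}
spec:
  replicas: 2
  selector:
    matchLabels:
      app: aicr-api-server
  template:
    metadata:
      labels:
        app: aicr-api-server
    spec:
      containers:
        - name: api-server
          image: ${IMAGE}
          ports:
            - name: http
              containerPort: ${PORT}
          readinessProbe:
            tcpSocket:
              port: http
            periodSeconds: 10
EOF
}

render_deployment
```

Apply it with `IMAGE=<your-image> ./deployment.sh | kubectl apply -f -`, where `deployment.sh` is whatever file name you save the sketch under.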
3. Create Service
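One possible Service for this step, assuming the pods carry a hypothetical `app: aicr-api-server` label and expose a container port named `http`. Both names are illustrative and must match whatever your Deployment defines.

```shell
#!/usr/bin/env bash
NAMESPACE="${NAMESPACE:-aicr}"

render_service() {
  cat <<EOF
apiVersion: v1
kind: Service
metadata:
  name: aicr-api-server
  namespace: ${NAMESPACE}
  labels:
    app: aicr-api-server
spec:
  type: ClusterIP
  selector:
    app: aicr-api-server
  ports:
    - name: http
      port: 80
      targetPort: http
EOF
}

render_service
```

A ClusterIP Service keeps the API reachable only inside the cluster, which fits the isolated-deployment goal; expose it externally through the optional Ingress step.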
4. Create Ingress (Optional)
Agent Deployment
Deploy the AICR Agent as a Kubernetes Job to automatically capture cluster configuration.
1. Create RBAC Resources
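As a sketch of the RBAC described in the overview (a ServiceAccount with ConfigMap create/update permissions), the manifest below binds a namespaced Role to a ServiceAccount. The `aicr-agent` names are placeholders, and the `get` verb is an added assumption so the agent can read a ConfigMap before updating it. Collectors that need node-level access may require more than this.

```shell
#!/usr/bin/env bash
NAMESPACE="${NAMESPACE:-aicr}"
SA_NAME="${SA_NAME:-aicr-agent}"   # hypothetical ServiceAccount name

render_agent_rbac() {
  cat <<EOF
apiVersion: v1
kind: ServiceAccount
metadata:
  name: ${SA_NAME}
  namespace: ${NAMESPACE}
---
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: ${SA_NAME}
  namespace: ${NAMESPACE}
rules:
  - apiGroups: [""]
    resources: ["configmaps"]
    verbs: ["get", "create", "update"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: ${SA_NAME}
  namespace: ${NAMESPACE}
subjects:
  - kind: ServiceAccount
    name: ${SA_NAME}
    namespace: ${NAMESPACE}
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: Role
  name: ${SA_NAME}
EOF
}

render_agent_rbac
```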
2. Create Agent Job
Note: The agent defaults to privileged mode, which is required for the GPU, SystemD, and OS collectors. For PSS-restricted namespaces where only the Kubernetes collector is needed, use `--privileged=false` when deploying via the CLI. See Agent Deployment for details.
3. Generate Recipe from ConfigMap
4. Generate Bundle
E2E Testing
Validate the complete workflow:
CLI tests use Kyverno Chainsaw for declarative YAML assertions. See tests/chainsaw/README.md for details.
Configuration Options
Environment Variables
Note: The API server uses structured JSON logging to stderr. The CLI supports three logging modes (CLI/Text/JSON), but the API server always uses JSON for consistent log aggregation.
ConfigMap for Custom Recipe Data (Advanced)
Note: This example shows the concept of mounting custom recipe data. The actual recipe format uses a base-plus-overlay architecture. See `recipes/` for the current schema (`overlays/*.yaml`, including `base.yaml`).
Mount in deployment:
High Availability
Horizontal Pod Autoscaler
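A hedged example of an `autoscaling/v2` HorizontalPodAutoscaler. The target Deployment name, the replica bounds, and the 70% CPU target are illustrative assumptions rather than tuned values. CPU-based scaling also requires a metrics source such as metrics-server and CPU requests on the container.

```shell
#!/usr/bin/env bash
NAMESPACE="${NAMESPACE:-aicr}"
DEPLOYMENT="${DEPLOYMENT:-aicr-api-server}"   # hypothetical Deployment name

render_hpa() {
  cat <<EOF
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: ${DEPLOYMENT}
  namespace: ${NAMESPACE}
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: ${DEPLOYMENT}
  minReplicas: 2
  maxReplicas: 10
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70
EOF
}

render_hpa
```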
Pod Disruption Budget
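A PodDisruptionBudget along these lines keeps at least one replica running during voluntary disruptions such as node drains. The `app: aicr-api-server` selector is an assumed label and must match your pods.

```shell
#!/usr/bin/env bash
NAMESPACE="${NAMESPACE:-aicr}"

render_pdb() {
  cat <<EOF
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: aicr-api-server
  namespace: ${NAMESPACE}
spec:
  minAvailable: 1
  selector:
    matchLabels:
      app: aicr-api-server
EOF
}

render_pdb
```

With `minAvailable: 1`, a single-replica Deployment can never be drained voluntarily, so pair this with at least two replicas.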
Monitoring
Prometheus ServiceMonitor
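If the cluster runs the Prometheus Operator, a ServiceMonitor similar to this sketch can scrape the `/metrics` endpoint. It assumes a Service labelled `app: aicr-api-server` with a port named `http`; both names, and the 30s interval, are placeholders.

```shell
#!/usr/bin/env bash
NAMESPACE="${NAMESPACE:-aicr}"

render_servicemonitor() {
  cat <<EOF
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: aicr-api-server
  namespace: ${NAMESPACE}
spec:
  selector:
    matchLabels:
      app: aicr-api-server
  endpoints:
    - port: http
      path: /metrics
      interval: 30s
EOF
}

render_servicemonitor
```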
Grafana Dashboard
Key panels:
- Request rate (by status code)
- Request duration (p50, p95, p99)
- Error rate
- Rate limit rejections
- Active connections
Security
Network Policies
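As one illustration, the NetworkPolicy below limits ingress to the API server pods to a single TCP port. The pod label and port 8080 are assumptions; tighten the rule with `from` selectors once you know which clients need access. Enforcement depends on a CNI plugin that supports NetworkPolicy.

```shell
#!/usr/bin/env bash
NAMESPACE="${NAMESPACE:-aicr}"
PORT="${PORT:-8080}"   # assumed API server port

render_networkpolicy() {
  cat <<EOF
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: aicr-api-server
  namespace: ${NAMESPACE}
spec:
  podSelector:
    matchLabels:
      app: aicr-api-server
  policyTypes:
    - Ingress
  ingress:
    - ports:
        - protocol: TCP
          port: ${PORT}
EOF
}

render_networkpolicy
```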
Pod Security Standards
RBAC (If API server needs K8s access)
Troubleshooting
Check Pod Status
Check Service
Check Ingress
Performance Issues
Connection Refused
- Check service exists: `kubectl get svc -n aicr`
- Check endpoints: `kubectl get endpoints -n aicr`
- Check pod is ready: `kubectl get pods -n aicr`
- Check readiness probe: `kubectl describe pod -n aicr <pod-name>`
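The checks above can be wrapped in a small helper so they run in order. The `diagnose_connection` function name and the `KUBECTL` override variable are illustrative conveniences, not part of AICR.

```shell
#!/usr/bin/env bash
KUBECTL="${KUBECTL:-kubectl}"
NAMESPACE="${NAMESPACE:-aicr}"

# Runs the connection-refused checks in sequence.
# Optional argument: a pod name to describe (shows readiness probe events).
diagnose_connection() {
  local pod="${1:-}"
  echo "== Service =="
  "$KUBECTL" get svc -n "$NAMESPACE"
  echo "== Endpoints =="
  "$KUBECTL" get endpoints -n "$NAMESPACE"
  echo "== Pods =="
  "$KUBECTL" get pods -n "$NAMESPACE"
  if [ -n "$pod" ]; then
    echo "== Pod detail =="
    "$KUBECTL" describe pod -n "$NAMESPACE" "$pod"
  fi
}
```

Call `diagnose_connection <pod-name>` after sourcing the file. An empty endpoints list usually means no pod is passing its readiness probe or the Service selector does not match the pod labels.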
Rate Limiting
Check rate limit settings:
Adjust via deployment:
Upgrading
Rolling Update
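A rolling update can be driven with standard `kubectl` commands, roughly as sketched here. The Deployment name `aicr-api-server` and container name `api-server` are placeholders for whatever your manifest defines.

```shell
#!/usr/bin/env bash
KUBECTL="${KUBECTL:-kubectl}"
NAMESPACE="${NAMESPACE:-aicr}"
DEPLOYMENT="${DEPLOYMENT:-aicr-api-server}"   # hypothetical Deployment name
CONTAINER="${CONTAINER:-api-server}"          # hypothetical container name

# Sets a new image, then waits for the rollout to finish.
rolling_update() {
  local image="$1"
  "$KUBECTL" set image "deployment/${DEPLOYMENT}" "${CONTAINER}=${image}" -n "$NAMESPACE" &&
    "$KUBECTL" rollout status "deployment/${DEPLOYMENT}" -n "$NAMESPACE" --timeout=120s
}

# Reverts to the previous ReplicaSet if the new version misbehaves.
rollback() {
  "$KUBECTL" rollout undo "deployment/${DEPLOYMENT}" -n "$NAMESPACE"
}
```

Run `rolling_update <new-image>` and, if `rollout status` times out or reports failure, follow with `rollback`.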
Blue-Green Deployment
Backup and Disaster Recovery
Export Configuration
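One way to export the deployed resources is a single `kubectl get ... -o yaml` into a file, as in this sketch. The resource list and the default file name `aicr-backup.yaml` are assumptions; add or remove kinds to match what you deployed.

```shell
#!/usr/bin/env bash
KUBECTL="${KUBECTL:-kubectl}"
NAMESPACE="${NAMESPACE:-aicr}"

# Writes the namespace's core resources to a YAML file.
# Optional argument: output path (defaults to aicr-backup.yaml).
export_config() {
  local out="${1:-aicr-backup.yaml}"
  "$KUBECTL" get deployment,service,configmap,ingress -n "$NAMESPACE" -o yaml > "$out"
}
```

The exported YAML includes server-populated fields such as `status` and `resourceVersion`, so keeping the original manifests in version control remains the more reliable backup.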
Restore from Backup
Cost Optimization
Resource Limits
Start with minimal resources:
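For instance, requests and limits can be set on an existing Deployment with `kubectl set resources`. The figures below are arbitrary starting points chosen for illustration, not measured AICR requirements, and the Deployment name is a placeholder.

```shell
#!/usr/bin/env bash
KUBECTL="${KUBECTL:-kubectl}"
NAMESPACE="${NAMESPACE:-aicr}"
DEPLOYMENT="${DEPLOYMENT:-aicr-api-server}"   # hypothetical Deployment name

# Applies small initial requests and limits; values are illustrative only.
set_minimal_resources() {
  "$KUBECTL" set resources "deployment/${DEPLOYMENT}" -n "$NAMESPACE" \
    --requests=cpu=100m,memory=128Mi \
    --limits=cpu=500m,memory=256Mi
}
```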
Monitor and adjust based on usage.
Vertical Pod Autoscaler (Optional)
See Also
- API Reference - API endpoint documentation
- Automation - CI/CD integration
- Data Flow - Understanding data architecture
- API Server Architecture - Internal architecture