API Reference

Complete reference for using the AICR API Server.

Overview

The AICR API Server provides HTTP REST access to recipe generation and bundle creation for GPU-accelerated infrastructure. Use the API for programmatic access to configuration recommendations and deployment artifacts.

┌──────────────┐ ┌──────────────┐
│ GET /recipe │─────▶│ Recipe │
└──────────────┘ └──────────────┘
┌──────────────┐ ┌──────────────┐
│ POST /bundle │─────▶│ bundles.zip │
└──────────────┘ └──────────────┘

API vs CLI

  • Use the API for remote recipe generation and bundle creation
  • Use the CLI for local operations, snapshot capture, and ConfigMap integration

| Feature | API | CLI |
| --- | --- | --- |
| Recipe generation | ✅ GET /v1/recipe | aicr recipe |
| Value query | ✅ GET /v1/query | aicr query |
| Bundle creation | ✅ POST /v1/bundle | aicr bundle |
| Snapshot capture | ❌ Use CLI | aicr snapshot |
| ConfigMap I/O | ❌ Use CLI | cm:// URIs |
| Agent deployment | ❌ Use CLI | aicr snapshot |

Base URL

Local development (example):

http://localhost:8080

Start the local server:

docker pull ghcr.io/nvidia/aicrd:latest
docker run -p 8080:8080 ghcr.io/nvidia/aicrd:latest

Quick Start

Get a Recipe

Generate an optimized configuration recipe for your environment:

# GET: Basic recipe for H100 on EKS (query parameters)
curl "http://localhost:8080/v1/recipe?accelerator=h100&service=eks"

# GET: Training workload on Ubuntu
curl "http://localhost:8080/v1/recipe?accelerator=h100&service=eks&intent=training&os=ubuntu"

# POST: Recipe from criteria (YAML body)
curl -X POST "http://localhost:8080/v1/recipe" \
  -H "Content-Type: application/x-yaml" \
  -d 'kind: RecipeCriteria
apiVersion: aicr.nvidia.com/v1alpha1
metadata:
  name: my-config
spec:
  service: eks
  accelerator: h100
  intent: training'

# Save recipe to file
curl -s "http://localhost:8080/v1/recipe?accelerator=h100&service=eks" -o recipe.json

Generate Bundles

Create deployment bundles from a recipe:

# Pipe recipe directly to bundle endpoint
curl -s "http://localhost:8080/v1/recipe?accelerator=h100&service=eks" | \
  curl -X POST "http://localhost:8080/v1/bundle?bundlers=gpu-operator" \
  -H "Content-Type: application/json" -d @- -o bundles.zip

# Extract the bundles
unzip bundles.zip -d ./bundles

Endpoints

GET /

Service information and available routes.

curl "http://localhost:8080/"

Response:

{
  "service": "aicrd",
  "version": "v0.7.6",
  "routes": ["/v1/recipe", "/v1/query", "/v1/bundle"]
}

GET /v1/recipe

Generate an optimized configuration recipe based on environment parameters.

Query Parameters:

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| service | string | any | K8s service: eks, gke, aks, oke, kind, lke, bcm, any |
| accelerator | string | any | GPU type: h100, h200, gb200, b200, a100, l40, rtx-pro-6000, any |
| gpu | string | any | Alias for accelerator |
| intent | string | any | Workload: training, inference, any |
| os | string | any | Node OS: ubuntu, rhel, cos, amazonlinux, talos, any |
| platform | string | any | Platform/framework: dynamo, kubeflow, nim, runai, slurm, any |
| nodes | integer | 0 | GPU node count (0 = any) |

Examples:

# Minimal request
curl "http://localhost:8080/v1/recipe"

# Specify accelerator
curl "http://localhost:8080/v1/recipe?accelerator=h100"

# Full specification
curl "http://localhost:8080/v1/recipe?service=eks&accelerator=h100&intent=training&os=ubuntu&nodes=8"

# Using gpu alias
curl "http://localhost:8080/v1/recipe?gpu=gb200&service=gke"

# Pretty print with jq
curl -s "http://localhost:8080/v1/recipe?accelerator=h100" | jq '.'
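
The parameter table above can also be enforced on the client before a request is sent. The following is a minimal sketch, not part of AICR: `recipe_url` and `ALLOWED` are hypothetical names, the allowed values are copied from the table, and the `gpu` alias is left out for brevity.

```python
from urllib.parse import urlencode

# Allowed values copied from the query-parameter table above.
ALLOWED = {
    "service": {"eks", "gke", "aks", "oke", "kind", "lke", "bcm", "any"},
    "accelerator": {"h100", "h200", "gb200", "b200", "a100", "l40", "rtx-pro-6000", "any"},
    "intent": {"training", "inference", "any"},
    "os": {"ubuntu", "rhel", "cos", "amazonlinux", "talos", "any"},
    "platform": {"dynamo", "kubeflow", "nim", "runai", "slurm", "any"},
}


def recipe_url(base_url, nodes=0, **criteria):
    """Build a GET /v1/recipe URL, rejecting values the table does not list."""
    for key, value in criteria.items():
        if key not in ALLOWED:
            raise ValueError(f"unknown parameter: {key}")
        if value not in ALLOWED[key]:
            raise ValueError(f"invalid {key}: {value}")
    params = dict(sorted(criteria.items()))
    if nodes:
        # 0 means "any", so the parameter is only sent when it is non-zero.
        params["nodes"] = nodes
    query = urlencode(params)
    return f"{base_url}/v1/recipe" + (f"?{query}" if query else "")


print(recipe_url("http://localhost:8080", accelerator="h100", service="eks", nodes=8))
```

Validating locally gives a clearer error than a round trip that ends in HTTP 400, but the server remains the source of truth for which values it accepts.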

POST /v1/recipe

Generate an optimized configuration recipe from criteria supplied in the request body. This endpoint is an alternative to the GET query parameters and accepts a Kubernetes-style RecipeCriteria resource.

Content Types:

  • application/json - JSON format
  • application/x-yaml - YAML format

Request Body:

The request body must be a RecipeCriteria resource:

kind: RecipeCriteria
apiVersion: aicr.nvidia.com/v1alpha1
metadata:
  name: my-criteria
spec:
  service: eks
  accelerator: gb200
  os: ubuntu
  intent: training
  platform: kubeflow
  nodes: 8

Examples:

# POST with YAML body
curl -X POST "http://localhost:8080/v1/recipe" \
  -H "Content-Type: application/x-yaml" \
  -d 'kind: RecipeCriteria
apiVersion: aicr.nvidia.com/v1alpha1
metadata:
  name: training-config
spec:
  service: eks
  accelerator: h100
  intent: training'

# POST with JSON body
curl -X POST "http://localhost:8080/v1/recipe" \
  -H "Content-Type: application/json" \
  -d '{
    "kind": "RecipeCriteria",
    "apiVersion": "aicr.nvidia.com/v1alpha1",
    "metadata": {"name": "training-config"},
    "spec": {
      "service": "eks",
      "accelerator": "h100",
      "intent": "training"
    }
  }'

# POST with criteria file
# --data-binary keeps the file's newlines; -d @file strips them, which breaks YAML
curl -X POST "http://localhost:8080/v1/recipe" \
  -H "Content-Type: application/x-yaml" \
  --data-binary @criteria.yaml

# Pretty print response
curl -s -X POST "http://localhost:8080/v1/recipe" \
  -H "Content-Type: application/json" \
  -d '{"kind":"RecipeCriteria","apiVersion":"aicr.nvidia.com/v1alpha1","spec":{"service":"eks","accelerator":"h100"}}' \
  | jq '.'
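
For programmatic callers, the same RecipeCriteria body can be assembled as a dictionary and serialized to JSON. This is a sketch using only the Python standard library; `recipe_criteria` and `build_recipe_request` are illustrative names, not an AICR client API.

```python
import json
import urllib.request


def recipe_criteria(name, **spec):
    """Return a RecipeCriteria resource as a dict, mirroring the YAML shown above."""
    return {
        "kind": "RecipeCriteria",
        "apiVersion": "aicr.nvidia.com/v1alpha1",
        "metadata": {"name": name},
        # Drop unset fields so that server-side defaults apply.
        "spec": {key: value for key, value in spec.items() if value is not None},
    }


def build_recipe_request(base_url, criteria):
    """Prepare (but do not send) a POST /v1/recipe request with a JSON body."""
    return urllib.request.Request(
        f"{base_url}/v1/recipe",
        data=json.dumps(criteria).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


criteria = recipe_criteria("training-config", service="eks", accelerator="h100", intent="training")
request = build_recipe_request("http://localhost:8080", criteria)
print(request.get_method(), request.full_url)
```

Passing the prepared request to `urllib.request.urlopen(request)` sends it; that call is left out so the sketch runs without a server.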

Error Responses:

  • 400 Bad Request - Invalid criteria format, missing required fields, or invalid enum values
  • 405 Method Not Allowed - Only GET and POST are supported

Response:

{
  "apiVersion": "aicr.nvidia.com/v1alpha1",
  "kind": "Recipe",
  "metadata": {
    "version": "v1.0.0",
    "created": "2026-01-11T10:30:00Z",
    "appliedOverlays": [
      "base",
      "eks",
      "eks-training",
      "gb200-eks-training"
    ],
    "excludedOverlays": [
      {
        "name": "h100-eks-ubuntu-training",
        "reason": "mixin-constraint-failed"
      }
    ],
    "constraintWarnings": [
      {
        "overlay": "h100-eks-ubuntu-training",
        "constraint": "OS.sysctl./proc/sys/kernel/osrelease",
        "expected": ">= 6.8",
        "actual": "5.15.0",
        "reason": "mixin-constraint-failed: expected >= 6.8, got 5.15.0"
      }
    ]
  },
  "criteria": {
    "service": "eks",
    "accelerator": "gb200",
    "intent": "training",
    "os": "any",
    "platform": "any"
  },
  "componentRefs": [
    {
      "name": "gpu-operator",
      "version": "v25.3.3",
      "order": 1,
      "repository": "https://helm.ngc.nvidia.com/nvidia"
    },
    {
      "name": "network-operator",
      "version": "v25.4.0",
      "order": 2,
      "repository": "https://helm.ngc.nvidia.com/nvidia"
    }
  ],
  "constraints": {
    "driver": {
      "version": "580.82.07",
      "cudaVersion": "13.1"
    }
  }
}

metadata.excludedOverlays is optional. When present, each entry includes the overlay name and a machine-readable reason such as constraint-failed or mixin-constraint-failed.
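
Because `metadata.excludedOverlays` may be absent, client code should read it defensively. The sketch below summarizes a recipe using only fields that appear in the sample response; `summarize_recipe` is a hypothetical helper, and treating `order` as the sort key for components is an assumption based on the sample.

```python
def summarize_recipe(recipe):
    """Return (ordered component lines, excluded-overlay lines) for a recipe dict."""
    refs = sorted(recipe.get("componentRefs", []), key=lambda ref: ref.get("order", 0))
    components = [f"{ref['name']} {ref['version']}" for ref in refs]
    # metadata.excludedOverlays is optional, so default to an empty list.
    excluded = recipe.get("metadata", {}).get("excludedOverlays", [])
    skipped = [f"{item['name']}: {item['reason']}" for item in excluded]
    return components, skipped


sample = {
    "metadata": {
        "excludedOverlays": [
            {"name": "h100-eks-ubuntu-training", "reason": "mixin-constraint-failed"}
        ]
    },
    "componentRefs": [
        {"name": "network-operator", "version": "v25.4.0", "order": 2},
        {"name": "gpu-operator", "version": "v25.3.3", "order": 1},
    ],
}
components, skipped = summarize_recipe(sample)
print(components)  # ['gpu-operator v25.3.3', 'network-operator v25.4.0']
print(skipped)     # ['h100-eks-ubuntu-training: mixin-constraint-failed']
```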


GET /v1/query

Query a specific value from a fully hydrated recipe. The server resolves a recipe from the criteria (same parameters as GET /v1/recipe), merges the base, overlay, and inline overrides, then returns the value at the given selector path.

Query Parameters:

All GET /v1/recipe parameters are supported, plus:

| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| selector | string | Yes | Dot-delimited path to the value to extract (e.g. components.gpu-operator.values.driver.version). An empty string returns the entire hydrated recipe. |

Response:

  • Scalar values (string, number, bool) are returned as plain JSON values
  • Complex values (maps, lists) are returned as JSON objects/arrays

Examples:

# Get a specific Helm value
curl -s "http://localhost:8080/v1/query?service=eks&accelerator=h100&intent=training&selector=components.gpu-operator.values.driver.version"

# Get deployment order
curl -s "http://localhost:8080/v1/query?service=eks&accelerator=h100&intent=training&selector=deploymentOrder" | jq '.'

# Get a component subtree
curl -s "http://localhost:8080/v1/query?service=eks&accelerator=h100&selector=components.gpu-operator.values.driver" | jq '.'
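
To illustrate what a dot-delimited selector does, here is a client-side sketch of path resolution over nested dictionaries. It is not the server's implementation: the `select` function and the shape of the `hydrated` sample are assumptions inferred from the selector examples above, and keys that themselves contain dots are not handled.

```python
def select(document, selector):
    """Walk a nested dict along a dot-delimited selector; "" returns the whole document."""
    if selector == "":
        return document
    current = document
    for key in selector.split("."):
        if not isinstance(current, dict) or key not in current:
            raise KeyError(f"selector segment not found: {key}")
        current = current[key]
    return current


# Hypothetical hydrated recipe fragment, shaped after the selector examples.
hydrated = {
    "components": {
        "gpu-operator": {"values": {"driver": {"version": "580.82.07"}}}
    }
}

print(select(hydrated, "components.gpu-operator.values.driver.version"))  # 580.82.07
print(select(hydrated, "components.gpu-operator.values.driver"))          # {'version': '580.82.07'}
```

The first call returns a scalar and the second a map, matching the two response shapes described above.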

POST /v1/bundle

Generate deployment bundles from a recipe.

Query Parameters:

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| bundlers | string | (all) | Comma-delimited list of bundler types to execute |
| set | string[] | | Value overrides (format: bundler:path.to.field=value). Repeat for multiple. |
| dynamic | string[] | | Declare value paths as install-time parameters (format: component:path.to.field). Repeat for multiple. Supported with deployer=helm, deployer=argocd-helm, deployer=flux, and deployer=helmfile. |
| system-node-selector | string[] | | Node selectors for system components (format: key=value). Repeat for multiple. |
| system-node-toleration | string[] | | Tolerations for system components (format: key=value:effect). Repeat for multiple. |
| accelerated-node-selector | string[] | | Node selectors for GPU nodes (format: key=value). Repeat for multiple. |
| accelerated-node-toleration | string[] | | Tolerations for GPU nodes (format: key=value:effect). Repeat for multiple. |
| nodes | int | 0 | Estimated number of GPU nodes (0 = unset). Written to Helm value paths declared in the registry under nodeScheduling.nodeCountPaths. |
| vendor-charts | bool | false | Pull upstream Helm chart bytes into the bundle at bundle time so the artifact is fully self-contained and air-gap deployable. Each vendored chart is recorded in provenance.yaml with name, version, source URL, and SHA256. This trades the upstream CVE-yank fail-loud signal for offline deployability; see the CLI reference's "Vendoring Charts for Air-Gap" section for the full tradeoff. Requires the helm binary on the API server's $PATH and registry credentials configured for any private upstream repos (HELM_REPOSITORY_USERNAME/HELM_REPOSITORY_PASSWORD for HTTP(S); docker config for OCI). If prerequisites are missing, the request fails with HTTP 500 and a structured error code (UNAVAILABLE for missing helm, UNAUTHORIZED for credentials). |
| deployer | string | helm | Deployment method: helm, argocd, argocd-helm, flux, or helmfile |
| repo | string | | Git repository URL for GitOps deployments (used with deployer=argocd and deployer=flux; ignored by deployer=argocd-helm) |
| app-name | string | | Parent Argo Application name (default: aicr-stack for deployer=argocd-helm, nvidia-stack for deployer=argocd). Must be a DNS-1123 subdomain. Required when deploying multiple non-overlapping AICR bundles to the same Argo CD namespace so the parent Applications do not collide. For deployer=argocd-helm, the value is the chart default and can still be overridden at install time via helm install --set appName=.... Rejected with HTTP 400 on other deployers. |

Request Body:

The request body is the recipe (RecipeResult) directly. No wrapper object needed.

Components

Bundler names correspond to component names in recipes/registry.yaml. Any component registered there can be passed as a bundler. Current components:

| Component | Description |
| --- | --- |
| agentgateway | Kubernetes Gateway API implementation for AI/ML inference (InferencePool routing) |
| agentgateway-crds | Kubernetes Gateway API CRDs for AI/ML inference (Gateway API + Inference Extension) |
| aws-ebs-csi-driver | Amazon EBS CSI driver (EKS) |
| aws-efa | AWS Elastic Fabric Adapter device plugin (EKS) |
| cert-manager | TLS certificate management |
| dynamo-platform | NVIDIA Dynamo inference serving platform |
| gke-nccl-tcpxo | NCCL TCPxO network plugin for optimized collective communication (GKE) |
| gpu-operator | NVIDIA GPU Operator: driver and runtime lifecycle |
| grove | Dynamo pod lifecycle management |
| k8s-ephemeral-storage-metrics | Ephemeral storage usage metrics |
| k8s-nim-operator | NVIDIA NIM Operator for inference microservice deployments |
| kai-scheduler | DRA-aware gang scheduler with topology-aware placement |
| kube-prometheus-stack | Prometheus, Grafana, Alertmanager monitoring stack |
| kubeflow-trainer | Kubeflow Training Operator for distributed training |
| kueue | Kubernetes-native job queuing for batch and AI workloads |
| network-operator | NVIDIA Network Operator: RDMA, SR-IOV, host networking |
| nfd | Node Feature Discovery: labels nodes with hardware features; publishes per-node NodeResourceTopology CRDs on production GPU recipes |
| nodewright-customizations | Environment-specific node tuning profiles |
| nodewright-operator | OS-level node tuning and kernel configuration |
| nvidia-dra-driver-gpu | Dynamic Resource Allocation driver for GPUs |
| nvsentinel | GPU health monitoring and automated remediation |
| prometheus-adapter | Custom metrics for HPA scaling |
| prometheus-operator-crds | CRDs for the prometheus-operator (Alertmanager, Prometheus, ServiceMonitor, etc.) |
| slinky-slurm | Slinky-managed Slurm cluster instance (Controller, LoginSet, NodeSet, RestApi); reconciled by slinky-slurm-operator |
| slinky-slurm-operator | SchedMD Slinky Slurm operator and admission webhook |
| slinky-slurm-operator-crds | CRDs for the SchedMD Slinky Slurm operator (slinky.slurm.net) |

Examples:

# Basic: pipe recipe to bundle (GPU Operator only)
curl -s "http://localhost:8080/v1/recipe?accelerator=h100&service=eks" | \
  curl -X POST "http://localhost:8080/v1/bundle?bundlers=gpu-operator" \
  -H "Content-Type: application/json" -d @- -o bundles.zip

# Advanced: with value overrides and Argo CD deployer
curl -s "http://localhost:8080/v1/recipe?accelerator=h100&service=eks" | \
  curl -X POST "http://localhost:8080/v1/bundle?bundlers=gpu-operator&deployer=argocd&repo=https://github.com/my-org/my-gitops-repo.git&set=gpuoperator:gds.enabled=true" \
  -H "Content-Type: application/json" -d @- -o bundles.zip

# With node scheduling for system and GPU nodes
curl -X POST "http://localhost:8080/v1/bundle?bundlers=gpu-operator&system-node-selector=nodeGroup=system&system-node-toleration=dedicated=system:NoSchedule&accelerated-node-selector=nvidia.com/gpu.present=true&accelerated-node-toleration=nvidia.com/gpu=present:NoSchedule" \
  -H "Content-Type: application/json" \
  -d @recipe.json \
  -o bundles.zip

# Generate GPU Operator bundle from saved recipe
curl -X POST "http://localhost:8080/v1/bundle?bundlers=gpu-operator" \
  -H "Content-Type: application/json" \
  -d @recipe.json \
  -o bundles.zip

# Generate all available bundles (no bundlers param)
curl -X POST "http://localhost:8080/v1/bundle" \
  -H "Content-Type: application/json" \
  -d '{
    "apiVersion": "aicr.nvidia.com/v1alpha1",
    "kind": "Recipe",
    "componentRefs": [
      {"name": "gpu-operator", "version": "v25.3.3", "type": "helm"},
      {"name": "network-operator", "version": "v25.4.0", "type": "helm"}
    ]
  }' \
  -o bundles.zip

# Generate multiple specific bundles
curl -X POST "http://localhost:8080/v1/bundle?bundlers=gpu-operator,network-operator" \
  -H "Content-Type: application/json" \
  -d '{
    "apiVersion": "aicr.nvidia.com/v1alpha1",
    "kind": "Recipe",
    "componentRefs": [
      {"name": "gpu-operator", "version": "v25.3.3", "type": "helm"},
      {"name": "network-operator", "version": "v25.4.0", "type": "helm"}
    ]
  }' \
  -o bundles.zip
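
Long bundle URLs with repeated parameters are easy to get wrong by hand. The sketch below builds one from Python values, emitting each list entry as a repeated query parameter as the table describes. `bundle_url` is a hypothetical helper. It percent-encodes reserved characters such as `:` and `=`, which a server that decodes query strings in the standard way should treat the same as the unencoded curl examples; that equivalence is an assumption here.

```python
from urllib.parse import urlencode


def bundle_url(base_url, bundlers=(), deployer=None, repo=None, **repeated):
    """Build a POST /v1/bundle URL; list-valued options become repeated parameters."""
    params = []
    if bundlers:
        params.append(("bundlers", ",".join(bundlers)))
    if deployer:
        params.append(("deployer", deployer))
    if repo:
        params.append(("repo", repo))
    for name, values in repeated.items():
        # Python identifiers use underscores; the API parameter names use hyphens.
        for value in values:
            params.append((name.replace("_", "-"), value))
    query = urlencode(params)
    return f"{base_url}/v1/bundle" + (f"?{query}" if query else "")


url = bundle_url(
    "http://localhost:8080",
    bundlers=["gpu-operator"],
    set=["gpuoperator:gds.enabled=true"],
    system_node_selector=["nodeGroup=system"],
)
print(url)
```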

Response Headers:

| Header | Description | Example |
| --- | --- | --- |
| Content-Type | Always application/zip | application/zip |
| Content-Disposition | Download filename | attachment; filename="bundles.zip" |
| X-Bundle-Files | Total files in archive | 10 |
| X-Bundle-Size | Uncompressed size (bytes) | 45678 |
| X-Bundle-Duration | Generation time | 1.234s |
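
The `X-Bundle-*` headers arrive as strings, so a client that logs or checks them needs to convert them first. A minimal sketch follows; `parse_bundle_headers` is a hypothetical helper, and it assumes the duration always has the seconds form shown in the example (`1.234s`), which the table does not guarantee.

```python
def parse_bundle_headers(headers):
    """Convert the X-Bundle-* headers to numbers; names are matched case-insensitively."""
    lowered = {name.lower(): value for name, value in headers.items()}
    return {
        "files": int(lowered["x-bundle-files"]),
        "size_bytes": int(lowered["x-bundle-size"]),
        # Assumes a plain seconds value with an "s" suffix, as in the example "1.234s".
        "duration_seconds": float(lowered["x-bundle-duration"].rstrip("s")),
    }


info = parse_bundle_headers(
    {"X-Bundle-Files": "10", "X-Bundle-Size": "45678", "X-Bundle-Duration": "1.234s"}
)
print(info)  # {'files': 10, 'size_bytes': 45678, 'duration_seconds': 1.234}
```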

Bundle Structure

bundles.zip
├── gpu-operator/
│   ├── values.yaml           # Helm chart values
│   ├── scripts/
│   │   ├── install.sh        # Installation script
│   │   └── uninstall.sh      # Cleanup script
│   ├── README.md             # Deployment instructions
│   └── checksums.txt         # SHA256 checksums
└── network-operator/
    ├── values.yaml
    ├── manifests/
    │   └── nfd-network-rule.yaml   # NodeFeatureRule for Mellanox NICs
    └── ...
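
Each bundle directory ships a `checksums.txt`, which can be verified without the `sha256sum` binary. The sketch below assumes the file uses the `sha256sum` line format (`<hex digest>  <relative path>`), which is consistent with this document's use of `sha256sum -c` on it; `verify_checksums` is a hypothetical helper.

```python
import hashlib
from pathlib import Path


def verify_checksums(bundle_dir):
    """Return the relative paths whose SHA256 does not match checksums.txt."""
    root = Path(bundle_dir)
    mismatched = []
    for line in (root / "checksums.txt").read_text().splitlines():
        if not line.strip():
            continue
        expected, name = line.split(maxsplit=1)
        name = name.lstrip("*")  # sha256sum marks binary-mode entries with "*"
        actual = hashlib.sha256((root / name).read_bytes()).hexdigest()
        if actual != expected.lower():
            mismatched.append(name)
    return mismatched
```

For example, `verify_checksums("./deployment/gpu-operator")` returns an empty list when every listed file matches.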

GET /health

Service health check (liveness probe).

curl "http://localhost:8080/health"

Response:

{
  "status": "healthy",
  "timestamp": "2026-01-11T10:30:00Z"
}

GET /ready

Service readiness check (readiness probe).

curl "http://localhost:8080/ready"

Response:

{
  "status": "ready",
  "timestamp": "2026-01-11T10:30:00Z"
}
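
Scripts that start the server and call it straight away can poll the readiness endpoint first. This sketch assumes that an HTTP 200 from `/ready` means the server is ready, which the response example suggests but does not state; `fetch_status` and `wait_until_ready` are hypothetical helpers.

```python
import time
import urllib.error
import urllib.request


def fetch_status(url):
    """Return the HTTP status for a GET, or None if the server is unreachable."""
    try:
        with urllib.request.urlopen(url, timeout=2) as resp:
            return resp.status
    except urllib.error.HTTPError as err:
        return err.code
    except (urllib.error.URLError, OSError):
        return None


def wait_until_ready(base_url, attempts=30, delay=1.0, fetch=fetch_status, sleep=time.sleep):
    """Poll GET /ready until it returns 200 or the attempts run out."""
    for attempt in range(attempts):
        if fetch(f"{base_url}/ready") == 200:
            return True
        if attempt < attempts - 1:
            sleep(delay)
    return False
```

The `fetch` and `sleep` parameters exist so the loop can be exercised in tests without a running server or real delays.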

GET /metrics

Prometheus metrics endpoint.

curl "http://localhost:8080/metrics"

Key Metrics:

| Metric | Type | Description |
| --- | --- | --- |
| aicr_http_requests_total | counter | Total HTTP requests by method, path, status |
| aicr_http_request_duration_seconds | histogram | Request latency distribution |
| aicr_http_requests_in_flight | gauge | Current concurrent requests |
| aicr_rate_limit_rejects_total | counter | Rate limit rejections |
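
For a quick look at traffic without a Prometheus server, the text exposition output can be parsed directly. The sketch below sums `aicr_http_requests_total` per status. The label name `status` is an assumption taken from the table's "by method, path, status" description, the sample text is invented for illustration, and the simple regular expression does not handle label values that contain `}`.

```python
import re
from collections import Counter

# Matches lines such as: aicr_http_requests_total{method="GET",status="200"} 42
SAMPLE_LINE = re.compile(r'^aicr_http_requests_total\{(?P<labels>[^}]*)\}\s+(?P<value>\S+)')
LABEL = re.compile(r'(\w+)="([^"]*)"')


def requests_by_status(metrics_text):
    """Sum aicr_http_requests_total per value of the (assumed) status label."""
    totals = Counter()
    for line in metrics_text.splitlines():
        match = SAMPLE_LINE.match(line)
        if match:
            labels = dict(LABEL.findall(match.group("labels")))
            totals[labels.get("status", "unknown")] += float(match.group("value"))
    return dict(totals)


# Invented sample in Prometheus text format, for illustration only.
sample_metrics = """\
# TYPE aicr_http_requests_total counter
aicr_http_requests_total{method="GET",path="/v1/recipe",status="200"} 40
aicr_http_requests_total{method="POST",path="/v1/bundle",status="200"} 2
aicr_http_requests_total{method="GET",path="/v1/recipe",status="429"} 3
aicr_http_requests_in_flight 1
"""
print(requests_by_status(sample_metrics))  # {'200': 42.0, '429': 3.0}
```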

Complete Workflow Example

Fetch a recipe and generate bundles in one workflow:

#!/bin/bash
set -euo pipefail

# Step 1: Get recipe for H100 on EKS for training
echo "Fetching recipe..."
curl -s "http://localhost:8080/v1/recipe?accelerator=h100&service=eks&intent=training" \
  -o recipe.json

# Display recipe summary
echo "Recipe components:"
jq -r '.componentRefs[] | " - \(.name): \(.version)"' recipe.json

# Step 2: Generate bundles from the saved recipe file
echo "Generating bundles..."
curl -s -X POST "http://localhost:8080/v1/bundle?bundlers=gpu-operator" \
  -H "Content-Type: application/json" \
  -d @recipe.json \
  -o bundles.zip

# Alternative: one-liner without intermediate file
# curl -s "http://localhost:8080/v1/recipe?accelerator=h100&service=eks" | \
#   curl -X POST "http://localhost:8080/v1/bundle?bundlers=gpu-operator" \
#   -H "Content-Type: application/json" -d @- -o bundles.zip

# Step 3: Extract and verify
echo "Extracting bundles..."
unzip -q bundles.zip -d ./deployment

# Verify checksums
echo "Verifying checksums..."
cd deployment/gpu-operator
sha256sum -c checksums.txt

# Step 4: Deploy (example)
echo "Bundle ready for deployment:"
ls -la

Error Handling

Error Response Format

{
  "code": "ERROR_CODE",
  "message": "Human-readable error description",
  "details": { ... },
  "requestId": "550e8400-e29b-41d4-a716-446655440000",
  "timestamp": "2026-01-11T10:30:00Z",
  "retryable": true
}

Error Codes

| Code | HTTP Status | Description | Retryable |
| --- | --- | --- | --- |
| INVALID_REQUEST | 400 | Invalid query parameters, request body, or disallowed criteria value | No |
| METHOD_NOT_ALLOWED | 405 | Wrong HTTP method | No |
| NO_MATCHING_RULE | 404 | No configuration found | No |
| RATE_LIMIT_EXCEEDED | 429 | Too many requests | Yes |
| INTERNAL_ERROR | 500 | Server error | Yes |
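
Client code can turn the documented error body into an exception that carries the code, request ID, and retry hint. This is a sketch under the assumption that every error response follows the format above; `ApiError` and `error_from_response` are hypothetical names, and bodies that are not JSON objects fall back to defaults.

```python
import json


class ApiError(Exception):
    """Exception built from the documented error response body."""

    def __init__(self, status, body):
        self.status = status
        self.code = body.get("code", "UNKNOWN")
        self.message = body.get("message", "")
        self.details = body.get("details", {})
        self.request_id = body.get("requestId", "")
        self.retryable = bool(body.get("retryable", False))
        super().__init__(f"{status} {self.code}: {self.message}")


def error_from_response(status, raw_body):
    """Build an ApiError from a status code and the raw response bytes or text."""
    try:
        body = json.loads(raw_body)
    except (TypeError, ValueError):
        body = {}
    if not isinstance(body, dict):
        body = {}
    return ApiError(status, body)


err = error_from_response(429, '{"code": "RATE_LIMIT_EXCEEDED", "message": "Too many requests", "retryable": true}')
print(err, err.retryable)  # 429 RATE_LIMIT_EXCEEDED: Too many requests True
```

Logging `request_id` alongside the message makes it easier to correlate a failure with server-side logs.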

Handling Rate Limits

# Check rate limit headers
curl -I "http://localhost:8080/v1/recipe?accelerator=h100"

# Response headers:
# X-RateLimit-Limit: 100
# X-RateLimit-Remaining: 95
# X-RateLimit-Reset: 1736589000

When rate limited (HTTP 429), use the Retry-After header:

# Wait for Retry-After when rate limited
headers=$(mktemp)
status=$(curl -s -o /dev/null -D "$headers" -w "%{http_code}" \
  "http://localhost:8080/v1/recipe?accelerator=h100")
if [ "$status" = "429" ]; then
  # Read Retry-After from the same response that returned 429
  retry_after=$(grep -i "^Retry-After:" "$headers" | awk '{print $2}' | tr -d '\r')
  echo "Rate limited. Retrying after ${retry_after}s..."
  sleep "${retry_after:-1}"
fi
rm -f "$headers"

Rate Limiting

  • Limit: 100 requests per second per IP
  • Burst: 200 requests
  • Headers: X-RateLimit-Limit, X-RateLimit-Remaining, X-RateLimit-Reset
  • 429 Response: Includes Retry-After header
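
A client can honor these limits with a small retry loop that waits for the `Retry-After` interval on HTTP 429. The sketch below is transport-agnostic: `call_with_retry` is a hypothetical helper, and the `send` callable (which must return a status, a dict of lowercase header names, and a body) stands in for whatever HTTP library is in use.

```python
import time


def call_with_retry(send, max_attempts=5, default_delay=1.0, sleep=time.sleep):
    """Call send() until it returns a non-429 status or the attempts run out.

    send() must return (status, headers, body), where headers is a dict
    with lowercase header names.
    """
    for attempt in range(max_attempts):
        status, headers, body = send()
        if status != 429 or attempt == max_attempts - 1:
            return status, body
        try:
            delay = float(headers.get("retry-after", default_delay))
        except ValueError:
            # Retry-After may also be an HTTP date; fall back to the default.
            delay = default_delay
        sleep(delay)


# Simulated responses: two rate-limited replies, then success.
replies = iter([(429, {"retry-after": "2"}, b""), (429, {}, b""), (200, {}, b"ok")])
waits = []
result = call_with_retry(lambda: next(replies), sleep=waits.append)
print(result, waits)  # (200, b'ok') [2.0, 1.0]
```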

Criteria Allowlists

The API server can be configured to restrict which criteria values are allowed. This enables operators to limit the API to specific accelerators, services, intents, or OS types.

Configuration

Allowlists are configured via environment variables when starting the server:

| Environment Variable | Description | Example |
| --- | --- | --- |
| AICR_ALLOWED_ACCELERATORS | Comma-separated list of allowed GPU types | h100,l40 |
| AICR_ALLOWED_SERVICES | Comma-separated list of allowed K8s services | eks,gke |
| AICR_ALLOWED_INTENTS | Comma-separated list of allowed workload intents | training |
| AICR_ALLOWED_OS | Comma-separated list of allowed OS types | ubuntu,rhel |

Behavior:

  • If an environment variable is not set, all values for that criteria are allowed
  • If an environment variable is set, only the specified values are permitted
  • The any value is always allowed regardless of allowlist configuration
  • Allowlists apply to both /v1/recipe and /v1/bundle endpoints
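
The behavior rules above can be expressed as a short decision function. The sketch below mirrors them for illustration only; it is not the server's code, `is_allowed` and `ALLOWLIST_VARS` are hypothetical names, and how the server treats a variable set to an empty string is not specified in this document.

```python
import os

# Environment variable names from the configuration table above.
ALLOWLIST_VARS = {
    "accelerator": "AICR_ALLOWED_ACCELERATORS",
    "service": "AICR_ALLOWED_SERVICES",
    "intent": "AICR_ALLOWED_INTENTS",
    "os": "AICR_ALLOWED_OS",
}


def is_allowed(criterion, value, environ=os.environ):
    """Apply the allowlist rules to one criteria value."""
    if value == "any":
        return True  # "any" is always allowed
    raw = environ.get(ALLOWLIST_VARS[criterion])
    if raw is None:
        return True  # variable not set: no restriction
    allowed = {item.strip() for item in raw.split(",") if item.strip()}
    return value in allowed


env = {"AICR_ALLOWED_ACCELERATORS": "h100,l40"}
print(is_allowed("accelerator", "h100", env))   # True
print(is_allowed("accelerator", "gb200", env))  # False
print(is_allowed("accelerator", "any", env))    # True
print(is_allowed("service", "gke", env))        # True (no service allowlist set)
```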

Example Configuration

# Start server allowing only H100 and L40 GPUs on EKS
docker run -p 8080:8080 \
  -e AICR_ALLOWED_ACCELERATORS=h100,l40 \
  -e AICR_ALLOWED_SERVICES=eks \
  ghcr.io/nvidia/aicrd:latest

Error Response

When a disallowed criteria value is requested:

curl "http://localhost:8080/v1/recipe?accelerator=gb200&service=eks"

Response (HTTP 400):

{
  "code": "INVALID_REQUEST",
  "message": "accelerator type not allowed",
  "details": {
    "requested": "gb200",
    "allowed": ["h100", "l40"]
  },
  "requestId": "550e8400-e29b-41d4-a716-446655440000",
  "timestamp": "2026-01-27T10:30:00Z",
  "retryable": false
}

CLI Behavior

The CLI (aicr) is not affected by allowlists. Allowlists only apply to the API server, allowing operators to restrict API access while maintaining full CLI functionality for administrative tasks.

Programming Language Examples

Python

import requests
import zipfile
import io

BASE_URL = "http://localhost:8080"

# Get recipe
params = {
    "accelerator": "h100",
    "service": "eks",
    "intent": "training",
    "os": "ubuntu"
}

resp = requests.get(f"{BASE_URL}/v1/recipe", params=params)
resp.raise_for_status()
recipe = resp.json()

print(f"Recipe has {len(recipe['componentRefs'])} components")

# Generate bundles: the recipe is the request body, bundlers are query params
resp = requests.post(
    f"{BASE_URL}/v1/bundle",
    params={"bundlers": "gpu-operator"},
    json=recipe,
)
resp.raise_for_status()

# Extract zip
with zipfile.ZipFile(io.BytesIO(resp.content)) as zf:
    zf.extractall("./deployment")
    print(f"Extracted {len(zf.namelist())} files")

Go

package main

import (
	"encoding/json"
	"fmt"
	"net/http"
	"net/url"
)

func main() {
	baseURL := "http://localhost:8080"

	// Get recipe
	params := url.Values{}
	params.Add("accelerator", "h100")
	params.Add("service", "eks")

	resp, err := http.Get(baseURL + "/v1/recipe?" + params.Encode())
	if err != nil {
		panic(err)
	}
	defer resp.Body.Close()

	if resp.StatusCode != http.StatusOK {
		panic("unexpected status: " + resp.Status)
	}

	// Decode only the field this example needs
	var recipe struct {
		ComponentRefs []json.RawMessage `json:"componentRefs"`
	}
	if err := json.NewDecoder(resp.Body).Decode(&recipe); err != nil {
		panic(err)
	}

	fmt.Printf("Got recipe with %d components\n", len(recipe.ComponentRefs))
}

JavaScript/Node.js

// Requires Node.js 18+ for the built-in fetch
const fs = require("fs");

const BASE_URL = "http://localhost:8080";

async function main() {
  // Get recipe
  const params = new URLSearchParams({
    accelerator: "h100",
    service: "eks",
    intent: "training"
  });

  const recipeResp = await fetch(`${BASE_URL}/v1/recipe?${params}`);
  if (!recipeResp.ok) {
    throw new Error(`Recipe request failed: HTTP ${recipeResp.status}`);
  }
  const recipe = await recipeResp.json();

  console.log(`Recipe has ${recipe.componentRefs.length} components`);

  // Generate bundles: the recipe is the request body, bundlers are query params
  const bundleResp = await fetch(`${BASE_URL}/v1/bundle?bundlers=gpu-operator`, {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify(recipe),
  });
  if (!bundleResp.ok) {
    throw new Error(`Bundle request failed: HTTP ${bundleResp.status}`);
  }

  // Save zip
  const buffer = await bundleResp.arrayBuffer();
  fs.writeFileSync("bundles.zip", Buffer.from(buffer));
  console.log("Bundles saved to bundles.zip");
}

main().catch((err) => {
  console.error(err);
  process.exit(1);
});

Shell Script (Batch Processing)

#!/bin/bash
# Generate recipes for multiple environments

environments=(
  "os=ubuntu&accelerator=h100&service=eks"
  "os=ubuntu&accelerator=gb200&service=gke"
  "os=rhel&accelerator=a100&service=aks"
)

for env in "${environments[@]}"; do
  echo "Fetching recipe for: $env"

  curl -s "http://localhost:8080/v1/recipe?${env}" \
    | jq -r '.componentRefs[] | "\(.name): \(.version)"'

  echo ""
done

OpenAPI Specification

The full OpenAPI 3.1 specification is available at: api/aicr/v1/server.yaml

Generate client SDKs:

# Download spec
curl https://raw.githubusercontent.com/NVIDIA/aicr/main/api/aicr/v1/server.yaml \
  -o openapi.yaml

# Generate Python client
openapi-generator-cli generate -i openapi.yaml -g python -o ./python-client

# Generate Go client
openapi-generator-cli generate -i openapi.yaml -g go -o ./go-client

# Generate TypeScript client
openapi-generator-cli generate -i openapi.yaml -g typescript-fetch -o ./ts-client

Troubleshooting

Common Issues

“Invalid accelerator type” error:

# Use valid values: h100, h200, gb200, b200, a100, l40, rtx-pro-6000, any
curl "http://localhost:8080/v1/recipe?accelerator=h100"

“Recipe is required” error:

# Send the recipe itself as the request body (no wrapper object); it must not be empty or null
curl -X POST "http://localhost:8080/v1/bundle" \
  -H "Content-Type: application/json" \
  -d @recipe.json \
  -o bundles.zip

Empty zip file:

# Check recipe has componentRefs
curl -s "http://localhost:8080/v1/recipe?accelerator=h100" | jq '.componentRefs'

Connection refused (local):

# Start local server first
make server
