Recipes, Overlays, and Mixins
The recipe data layer is the rule-based engine that turns a Criteria
query ({service, accelerator, intent, os, platform, nodes}) into a
resolved RecipeResult — the merged spec, component refs, deployment
order, and validation phases that aicr bundle consumes.
This page covers everything related to AICR recipes for contributors:
the three layers that contribute data (registry, overlay,
mixin), the on-disk schemas for each, the resolver’s merge
algorithm, and the invariants the resolver enforces. End-user recipe
authoring lives in
recipe-development.md; this
page is for contributors changing recipe content or extending the
resolver in pkg/recipe.
Where does my change go? Most changes hit exactly one of three files. Skim Decision matrix before editing — picking the wrong layer leaks defaults across recipes or duplicates content across overlays.
Layered Model
Resolution: the resolver loads the base spec (overlays/base.yaml) as
the merge seed, then merges each matching overlay’s inheritance chain
on top (base → … → leaf), then applies the leaf’s mixins, then
finally injects registry defaults for any component field the chain
left unset. Per-component values files (recipes/components/<name>/)
are pulled in at bundle time, not at recipe resolution.
Decision Matrix
Rule of thumb: a change targeting all recipes goes in registry; a change targeting one cluster shape goes in an overlay; a change shared by ≥ 2 overlays as an opt-in fragment goes in a mixin.
Registry (recipes/registry.yaml)
The registry is the component catalog. Each entry declares a chart or
kustomization the resolver can reference and supplies defaults the
resolver injects into any ComponentRef that leaves the field unset.
Top-level schema (ComponentRegistry):
ComponentConfig fields (see pkg/recipe/components.go):
HelmConfig: defaultRepository, defaultChart, defaultVersion,
defaultNamespace. KustomizeConfig: defaultSource, defaultPath,
defaultTag. A component must have either helm or kustomize,
not both.
pkg/component/generic.go carries a ComponentConfig marked
Deprecated: — that is a separate, unused-in-production legacy type;
the live ComponentConfig is the one in pkg/recipe/components.go.
Defaults flow into a ComponentRef only when the field is empty —
see applyRegistryDefaults below.
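The fill-only-when-empty rule can be sketched as follows. This is a minimal illustration, not the real applyRegistryDefaults: the ComponentRef and HelmDefaults structs here are hypothetical, trimmed stand-ins (defaultRepository and the kustomize fields are omitted), and all values are placeholders.

```go
package main

import "fmt"

// ComponentRef is a hypothetical, trimmed stand-in for the resolver's type.
type ComponentRef struct {
	Name, Chart, Version, Namespace string
}

// HelmDefaults mirrors the defaultChart/defaultVersion/defaultNamespace
// registry fields (assumed shape, for illustration only).
type HelmDefaults struct {
	DefaultChart, DefaultVersion, DefaultNamespace string
}

// orDefault keeps cur when it is set and falls back to def otherwise.
func orDefault(cur, def string) string {
	if cur != "" {
		return cur
	}
	return def
}

// withDefaults returns a copy of ref with only its empty fields filled from d.
func withDefaults(ref ComponentRef, d HelmDefaults) ComponentRef {
	ref.Chart = orDefault(ref.Chart, d.DefaultChart)
	ref.Version = orDefault(ref.Version, d.DefaultVersion)
	ref.Namespace = orDefault(ref.Namespace, d.DefaultNamespace)
	return ref
}

func main() {
	d := HelmDefaults{
		DefaultChart:     "example-chart",
		DefaultVersion:   "v1.0.0",
		DefaultNamespace: "example-ns",
	}
	// The overlay chain pinned a version; chart and namespace were left unset.
	ref := ComponentRef{Name: "example", Version: "v2.0.0"}
	fmt.Printf("%+v\n", withDefaults(ref, d))
}
```

The version pinned by the chain survives; only the chart and namespace come from the registry.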
Overlay (recipes/overlays/)
An overlay is a RecipeMetadata document with a spec.criteria block
that selects it for matching queries. Overlays live in
recipes/overlays/ and inherit single-parent via spec.base.
Criteria fields (see pkg/recipe/criteria.go type Criteria):
--data overlays may contribute additional values via the criteria
registry — Has(FieldX, ...) is consulted when a value misses the
fast-path switch in Parse<X>. Adding a new value to a Go enum
(e.g., a new accelerator) is multi-file work; audit
CriteriaAccelerator* callers as listed in CLAUDE.md before merging.
Specificity. Each Criteria value carries a specificity score equal to
the count of non-any, non-empty fields. The current Specificity()
in criteria.go counts six fields: service, accelerator,
intent, os, platform, nodes. Overlays are sorted by
specificity ascending, so less-specific overlays merge first.
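A minimal sketch of the scoring and ordering described above, assuming a simplified Criteria struct in which all six fields are plain strings (the real type in criteria.go may use distinct types, particularly for nodes). The Overlay struct and the overlay names are hypothetical.

```go
package main

import (
	"fmt"
	"sort"
)

// Criteria is a simplified stand-in: every field is modelled as a string.
type Criteria struct {
	Service, Accelerator, Intent, OS, Platform, Nodes string
}

// specificity counts the fields that are neither empty nor "any".
func specificity(c Criteria) int {
	n := 0
	for _, v := range []string{c.Service, c.Accelerator, c.Intent, c.OS, c.Platform, c.Nodes} {
		if v != "" && v != "any" {
			n++
		}
	}
	return n
}

// Overlay is a hypothetical pairing of an overlay name with its criteria.
type Overlay struct {
	Name     string
	Criteria Criteria
}

// sortBySpecificity orders overlays so that less-specific ones merge first.
// The sort is stable, so ties keep their input order.
func sortBySpecificity(overlays []Overlay) {
	sort.SliceStable(overlays, func(i, j int) bool {
		return specificity(overlays[i].Criteria) < specificity(overlays[j].Criteria)
	})
}

func main() {
	overlays := []Overlay{
		{Name: "leaf", Criteria: Criteria{Service: "eks", Accelerator: "gb200", Intent: "training"}},
		{Name: "wildcard", Criteria: Criteria{Service: "any", Accelerator: "gb200"}},
	}
	sortBySpecificity(overlays)
	for _, o := range overlays {
		fmt.Println(o.Name, specificity(o.Criteria))
	}
}
```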
Matching is asymmetric. Recipe-side any is a wildcard (matches
anything in the query); query-side any is not a wildcard (matches
only recipe-side any). A generic query never resolves to a
hardware-specific recipe. See MatchesCriteriaField in
criteria.go.
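The asymmetry is easy to get backwards, so here is a small sketch of a single-field comparison. It is an illustration under the assumption that fields are plain strings and that an empty recipe-side field behaves like any; the real MatchesCriteriaField may normalise values differently.

```go
package main

import "fmt"

// matchesField reports whether a recipe-side value accepts a query-side value.
// Recipe-side "any" (or empty, by assumption) is a wildcard; query-side "any"
// is an ordinary value that only a recipe-side wildcard can accept.
func matchesField(recipeVal, queryVal string) bool {
	if recipeVal == "" || recipeVal == "any" {
		return true
	}
	return recipeVal == queryVal
}

func main() {
	fmt.Println(matchesField("any", "eks")) // recipe wildcard accepts a concrete query
	fmt.Println(matchesField("eks", "any")) // generic query does not reach a specific recipe
	fmt.Println(matchesField("eks", "eks")) // exact match
}
```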
Inheritance. spec.base walks a single-parent chain from leaf →
… → base (the root spec, held separately on the metadata store).
Cycles are detected at catalog load. Per-field merge: constraints
merge by name (later wins on same name; new appended); componentRefs
merge by name field-by-field; criteria are not inherited (each
recipe declares its own).
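The merge-by-name rule for constraints can be sketched as below. The Constraint struct and the sample names are hypothetical; the real merge lives on the spec type and also handles componentRefs field by field, which this sketch does not attempt.

```go
package main

import "fmt"

// Constraint is a hypothetical name/value pair.
type Constraint struct {
	Name, Value string
}

// mergeByName overlays later constraints onto earlier ones: an entry with an
// existing name replaces it in place, and a new name is appended at the end.
// The inputs are not modified.
func mergeByName(base, overlay []Constraint) []Constraint {
	out := make([]Constraint, len(base))
	copy(out, base)
	index := make(map[string]int, len(out))
	for i, c := range out {
		index[c.Name] = i
	}
	for _, c := range overlay {
		if i, ok := index[c.Name]; ok {
			out[i] = c // later wins on same name
			continue
		}
		index[c.Name] = len(out)
		out = append(out, c) // new entries append
	}
	return out
}

func main() {
	parent := []Constraint{{"kernel", ">=5.15"}, {"driver", ">=550"}}
	child := []Constraint{{"driver", ">=570"}, {"os", "ubuntu"}}
	fmt.Println(mergeByName(parent, child))
}
```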
Leaf. A leaf is the most specific overlay in a chain — the
terminal node carrying fully-qualified criteria (every relevant
dimension set, e.g. service + accelerator + os + intent +
platform) that an end-user query actually resolves to. A leaf
usually adds little of its own (often componentRefs: []); its job is
to bind one inheritance chain plus its mixins under a single
criteria fingerprint. “Base → … → leaf” throughout this page refers
to walking from the root spec down to this node. Leaf is a role, not a
distinct kind — every overlay is a RecipeMetadata; “leaf” just
names the ones at the end of a chain.
Mixin Composition
Inheritance is single-parent, which means cross-cutting concerns (OS
constraints, platform components) would otherwise duplicate across
every leaf. Mixins are composable fragments referenced via
spec.mixins. They live in recipes/mixins/ and use kind
RecipeMixin.
Mixin files currently in the tree: os-ubuntu, os-talos,
platform-inference, platform-kubeflow.
Mixin rules:
- A mixin carries only constraints and componentRefs. Setting criteria, base, mixins, or validation is rejected at load.
- Resolution order: base chain merged first, then mixins applied to the merged result. A leaf adopts a mixin by listing its file basename in spec.mixins.
- Mixin componentRefs are restricted to additive merges via mixinComponentRefSafeForMerge (see pkg/recipe/metadata_store.go). A mixin componentRef may only set name, namespace, manifestFiles, preManifestFiles. Setting any of chart, type, source, version, tag, path, valuesFile, overrides, patches, dependencyRefs, cleanup, expectedResources, healthCheckAsserts is rejected at compose time: those fields silently override the chain’s chosen chart, so the resolver names the offending field and refuses to merge (see ADR-005 “Silent constraint override” mitigation).
- When a snapshot evaluator is wired in, mixin constraints are evaluated against it after merging; failure invalidates the entire composed candidate. In plain query mode mixin constraints are merged but not evaluated.
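A rough sketch of the additive-only check on mixin componentRefs, assuming the ref has been decoded into a map[string]any. The function name checkMixinRef and the error wording are hypothetical; the real mixinComponentRefSafeForMerge operates on the typed struct and may report errors differently.

```go
package main

import "fmt"

// forbiddenMixinFields lists the identity/sourcing fields a mixin componentRef
// may not set, in the order given in the rules above.
var forbiddenMixinFields = []string{
	"chart", "type", "source", "version", "tag", "path", "valuesFile",
	"overrides", "patches", "dependencyRefs", "cleanup",
	"expectedResources", "healthCheckAsserts",
}

// checkMixinRef returns an error naming the first forbidden field that is set.
// Walking a fixed slice (not the map) keeps the reported field deterministic.
func checkMixinRef(ref map[string]any) error {
	for _, field := range forbiddenMixinFields {
		if _, set := ref[field]; set {
			return fmt.Errorf("mixin componentRef %v sets forbidden field %q", ref["name"], field)
		}
	}
	return nil
}

func main() {
	ok := map[string]any{"name": "example", "manifestFiles": []any{"extra.yaml"}}
	bad := map[string]any{"name": "example", "version": "v9.9.9"}
	fmt.Println(checkMixinRef(ok))
	fmt.Println(checkMixinRef(bad))
}
```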
Criteria Wildcard Overlays
Some overlays apply across an entire criteria dimension without being
referenced via spec.base or spec.mixins. The resolver picks them
up automatically because FindMatchingOverlays returns all maximal
matches, not just the most specific one. Two wildcard patterns in
the tree today: gb200-any.yaml (matches service: any) and
monitoring-hpa.yaml (matches intent: any).
For a query {service: eks, accelerator: gb200, intent: training},
the resolver returns three independent maximal leaves —
gb200-eks-training (matched by explicit criteria), gb200-any
(matched by service: any), and monitoring-hpa (matched by
intent: any). Each leaf’s inheritance chain is resolved separately
and merged onto the base spec in specificity order.
Maximal-leaf filter. filterToMaximalLeaves (in
metadata_store.go) drops any match that is a transitive
spec.base ancestor of another match — ancestors re-enter the
output via chain resolution, so keeping them as separate matches
would double-count their contributions. Independent leaves on
unrelated chains (wildcard + explicit) are kept; one is not an
ancestor of the other.
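The ancestor-dropping rule can be sketched as follows. This is an illustration rather than the real filterToMaximalLeaves: overlays are reduced to a name plus a spec.base pointer, and the intermediate overlay gb200-eks is a hypothetical parent invented for the example.

```go
package main

import "fmt"

// Overlay is reduced to its name and its single spec.base parent ("" = none).
type Overlay struct {
	Name, Base string
}

// filterToMaximal drops every match that is a transitive ancestor of another
// match. Input order is preserved for the survivors.
func filterToMaximal(matches []string, byName map[string]Overlay) []string {
	ancestors := make(map[string]bool)
	for _, m := range matches {
		seen := map[string]bool{m: true} // guards against a cyclic chain
		for p := byName[m].Base; p != "" && !seen[p]; p = byName[p].Base {
			ancestors[p] = true
			seen[p] = true
		}
	}
	var out []string
	for _, m := range matches {
		if !ancestors[m] {
			out = append(out, m)
		}
	}
	return out
}

func main() {
	byName := map[string]Overlay{
		"gb200-eks":          {Name: "gb200-eks"}, // hypothetical intermediate parent
		"gb200-eks-training": {Name: "gb200-eks-training", Base: "gb200-eks"},
		"gb200-any":          {Name: "gb200-any"},
		"monitoring-hpa":     {Name: "monitoring-hpa"},
	}
	matches := []string{"gb200-eks", "gb200-eks-training", "gb200-any", "monitoring-hpa"}
	fmt.Println(filterToMaximal(matches, byName))
}
```

The parent is removed because it re-enters through the chain of gb200-eks-training; the two wildcard overlays sit on unrelated chains and are kept.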
When to use a wildcard overlay vs a mixin:
Precedence. Leaves merge in specificity-ascending order, so a
service-specific leaf overrides the wildcard on same-named
constraints. spec.validation.<phase> blocks merge per-field:
checks and constraints union (nil = inherit, [] = clear,
non-empty = union); nodeSelection and infrastructure are
wholesale-replace. Don’t carry per-fabric values in a wildcard
(NCCL bandwidth thresholds differ per service); reserve wildcards
for content genuinely uniform across the wildcard dimension.
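The three-way rule for checks and constraints (nil inherits, an empty list clears, a non-empty list unions) depends on Go distinguishing a nil slice from an empty one. A minimal sketch, assuming checks are plain strings and using hypothetical check names:

```go
package main

import "fmt"

// mergeChecks applies the per-field rule for a validation phase list:
//   overlay == nil      → inherit the base list
//   len(overlay) == 0   → clear the list
//   otherwise           → union, base entries first, duplicates dropped
func mergeChecks(base, overlay []string) []string {
	if overlay == nil {
		return append([]string(nil), base...) // copy so callers never alias base
	}
	if len(overlay) == 0 {
		return []string{}
	}
	seen := make(map[string]bool, len(base)+len(overlay))
	out := make([]string, 0, len(base)+len(overlay))
	for _, list := range [][]string{base, overlay} {
		for _, c := range list {
			if !seen[c] {
				seen[c] = true
				out = append(out, c)
			}
		}
	}
	return out
}

func main() {
	base := []string{"check-a", "check-b"}
	fmt.Println(mergeChecks(base, nil))                            // inherit
	fmt.Println(mergeChecks(base, []string{}))                     // clear
	fmt.Println(mergeChecks(base, []string{"check-b", "check-c"})) // union
}
```

Whether an omitted YAML key decodes to nil and an explicit [] to an empty slice depends on the decoder and struct tags in use, so treat that mapping as an assumption to verify against the real types.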
Merge Algorithm
The resolver lives in pkg/recipe/metadata_store.go. The merge
proceeds in fixed precedence (low → high):
Each step wins over everything to its left — --set overrides the
overlay leaf, the leaf overrides the base chain, and so on. Read as
priority, not as temporal order.
Implementation notes:
- Seed. initBaseMergedSpec() clones s.Base (parsed from overlays/base.yaml) into the merge target. The base spec is held separately on the metadata store; it is not an overlay candidate in FindMatchingOverlays.
- Chain merge. For each maximal leaf, the inheritance chain is walked root → leaf and mergedSpec.Merge(&recipe.Spec) is called for each. Same-named constraints/componentRefs override; new entries append.
- Mixin merge. mergeMixins(mergedSpec) walks spec.mixins on the leaf, loads each from recipes/mixins/, and appends. mixinComponentRefSafeForMerge rejects mixin componentRefs that touch identity/sourcing fields.
- Registry defaults. applyRegistryDefaults(provider, refs) fills in chart/version/namespace/source/tag/path defaults for any ComponentRef field still empty after the chain merge. Failure to load the registry is propagated, not swallowed; partial refs would fail downstream far from the root cause.
- Topological sort. TopologicalSort() orders components by dependencyRefs for the final DeploymentOrder. Cycles produce ErrCodeInvalidRequest.
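For the final step, a dependency-ordered sort with cycle detection can be sketched with Kahn's algorithm. This is a generic illustration, not the project's TopologicalSort(): the input shape (a map from component name to its dependencyRefs), the plain error value, and the component names are all assumptions. The ready queue is kept sorted so the output does not depend on map iteration order.

```go
package main

import (
	"errors"
	"fmt"
	"sort"
)

// topoSort returns names ordered so every component follows its dependencies.
// deps maps a component to the components it depends on.
func topoSort(deps map[string][]string) ([]string, error) {
	indegree := make(map[string]int)
	dependents := make(map[string][]string)
	for name, refs := range deps {
		if _, ok := indegree[name]; !ok {
			indegree[name] = 0
		}
		for _, d := range refs {
			if _, ok := indegree[d]; !ok {
				indegree[d] = 0
			}
			indegree[name]++
			dependents[d] = append(dependents[d], name)
		}
	}
	var ready []string
	for name, n := range indegree {
		if n == 0 {
			ready = append(ready, name)
		}
	}
	sort.Strings(ready)

	var order []string
	for len(ready) > 0 {
		next := ready[0]
		ready = ready[1:]
		order = append(order, next)
		for _, m := range dependents[next] {
			indegree[m]--
			if indegree[m] == 0 {
				ready = append(ready, m)
			}
		}
		sort.Strings(ready) // deterministic tie-break
	}
	if len(order) != len(indegree) {
		return nil, errors.New("dependency cycle detected")
	}
	return order, nil
}

func main() {
	order, err := topoSort(map[string][]string{
		"operator-b":   {"cert-manager"},
		"operator-a":   {"cert-manager"},
		"cert-manager": nil,
	})
	fmt.Println(order, err)
}
```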
Deep-copy semantics. deepMergeMap (metadata.go) recurses into
nested map[string]any. Non-map values (scalars and []any) are
deep-copied via serializer.DeepCopyAny so dst never aliases
src’s slice values. This matters: copying []any by reference
during overlay merge would let a downstream mutation (e.g., bundler
appending a toleration) leak back into the cached source map and
corrupt subsequent queries. The
CLAUDE.md
anti-patterns list calls this out — any new helper that touches
overlay-derived maps must follow the same rule.
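The aliasing hazard is easiest to see in code. The sketch below is a standalone approximation of the rule, not the project's deepMergeMap or serializer.DeepCopyAny: deepCopy and deepMerge are hypothetical helpers written for this example, and the tolerations values are placeholders.

```go
package main

import "fmt"

// deepCopy clones nested maps and slices; scalars are returned as-is.
func deepCopy(v any) any {
	switch t := v.(type) {
	case map[string]any:
		m := make(map[string]any, len(t))
		for k, e := range t {
			m[k] = deepCopy(e)
		}
		return m
	case []any:
		s := make([]any, len(t))
		for i, e := range t {
			s[i] = deepCopy(e)
		}
		return s
	default:
		return v
	}
}

// deepMerge merges src into dst. Nested maps recurse; every other value is
// deep-copied so dst never shares a slice or map with src.
func deepMerge(dst, src map[string]any) {
	for k, sv := range src {
		srcMap, srcIsMap := sv.(map[string]any)
		dstMap, dstIsMap := dst[k].(map[string]any)
		if srcIsMap && dstIsMap {
			deepMerge(dstMap, srcMap)
			continue
		}
		dst[k] = deepCopy(sv)
	}
}

func main() {
	cached := map[string]any{"tolerations": []any{"gpu"}} // stands in for a cached overlay map
	merged := map[string]any{}
	deepMerge(merged, cached)

	// A downstream mutation of the merged result...
	merged["tolerations"].([]any)[0] = "mutated"

	// ...must not be visible in the cached source.
	fmt.Println(cached["tolerations"], merged["tolerations"])
}
```

Replacing deepCopy(sv) with a plain assignment would make both maps print the mutated value, which is the corruption the paragraph above warns about.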
Determinism
Recipe output is reproducible: same inputs → same bytes. The data layer enforces this via two rules.
Use serializer.MarshalYAMLDeterministic for any output that feeds
a digest, signature, OCI manifest, or fingerprint. yaml.v3 walks
Go maps in randomized order, so two consecutive marshals of the same
map[string]any produce different byte sequences. Plain
yaml.Marshal is fine for human-readable scratch output but is a
correctness bug anywhere a downstream consumer hashes the bytes.
Per-dimension ordered lists, not unordered maps. RecipeResult
fields like appliedOverlays, componentRefs, deploymentOrder,
and the per-dimension fingerprint diff are ordered slices, not maps,
so iteration is deterministic.
Recipe Store Immutability
The metadata store is read-only after init. LoadMetadataStoreFor(dp)
returns a sync.Once-cached *MetadataStore per DataProvider
identity, so concurrent recipe builds against the same provider share
the store without locks. Per-request mutations (chain resolution,
constraint evaluation, registry defaulting) happen on clones, never
on the cached spec.
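A per-provider, build-once cache of the kind described above can be sketched with sync.Map and sync.Once. Everything here is hypothetical scaffolding for illustration (Store, loadStoreFor, evictStore, and the string provider key); the real LoadMetadataStoreFor keys on DataProvider identity and returns a *MetadataStore.

```go
package main

import (
	"fmt"
	"sync"
)

// Store stands in for the read-only metadata store.
type Store struct {
	Provider string
}

// cacheEntry pairs a sync.Once with the store it guards.
type cacheEntry struct {
	once  sync.Once
	store *Store
}

var storeCache sync.Map // provider key → *cacheEntry

// loadStoreFor returns the cached store for a provider, building it at most
// once even when called from many goroutines.
func loadStoreFor(provider string, build func() *Store) *Store {
	v, _ := storeCache.LoadOrStore(provider, &cacheEntry{})
	e := v.(*cacheEntry)
	e.once.Do(func() { e.store = build() })
	return e.store
}

// evictStore drops one provider's entry without touching the others.
func evictStore(provider string) {
	storeCache.Delete(provider)
}

func main() {
	builds := 0
	build := func() *Store { builds++; return &Store{Provider: "embedded"} }

	a := loadStoreFor("embedded", build)
	b := loadStoreFor("embedded", build)
	fmt.Println(a == b, builds) // same pointer, built once

	evictStore("embedded")
	loadStoreFor("embedded", build)
	fmt.Println(builds) // rebuilt after eviction
}
```

Callers that obtained a store before eviction keep their pointer; because the store is never mutated after init, that stale copy stays internally consistent.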
Deferred registration. pendingRegistryEntry stages each
overlay’s criteria for the per-provider criteria registry before
registration. The actual Register(field, value, origin) calls only
fire after every overlay parses cleanly, the base recipe is present,
and dependency validation passes. Partial catalog loads never leak
into the registry; a malformed overlay does not poison criteria
validation for the next process.
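The stage-then-commit pattern behind deferred registration can be sketched as below. The types and the loadCatalog function are hypothetical simplifications: only one criteria field is modelled, a Valid flag stands in for parsing, and the base-recipe and dependency checks are omitted.

```go
package main

import "fmt"

// overlayDoc is a hypothetical, minimal overlay record.
type overlayDoc struct {
	Name    string
	Service string // the single criteria field modelled here
	Valid   bool   // stands in for "parses cleanly"
}

// pendingEntry is a staged registration, held until the whole catalog loads.
type pendingEntry struct {
	Field, Value, Origin string
}

// loadCatalog stages one entry per overlay and commits them to the registry
// only after every overlay has passed. On any failure nothing is registered.
func loadCatalog(docs []overlayDoc) (map[string][]string, error) {
	var staged []pendingEntry
	for _, d := range docs {
		if !d.Valid {
			return nil, fmt.Errorf("overlay %q failed to parse", d.Name)
		}
		staged = append(staged, pendingEntry{Field: "service", Value: d.Service, Origin: d.Name})
	}

	registry := make(map[string][]string)
	for _, p := range staged {
		registry[p.Field] = append(registry[p.Field], p.Value)
	}
	return registry, nil
}

func main() {
	good := []overlayDoc{{"one", "eks", true}, {"two", "gke", true}}
	bad := []overlayDoc{{"one", "eks", true}, {"broken", "custom", false}}

	fmt.Println(loadCatalog(good))
	fmt.Println(loadCatalog(bad))
}
```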
Eviction. EvictCachedStore(provider) and
EvictCachedRegistry(provider) drop a single provider’s cache entry
without disturbing other providers. Use after rewriting a --data
overlay on disk.
Observable RecipeResult Surfaces
RecipeResult (in pkg/recipe/metadata.go) is the resolver’s
externally-visible product. Fields beyond ComponentRefs and
DeploymentOrder that contributors should be aware of:
ComponentRef extras beyond the chart-identity fields:
Adding a Recipe
- Decide registry vs overlay vs mixin (decision matrix).
- Write the YAML in the correct directory. For an overlay, set spec.base to the most specific shared ancestor and let the chain carry shared constraints; only declare what differs.
- Ship the chainsaw health check (registry entries only). Every new component in recipes/registry.yaml MUST declare healthCheck.assertFile pointing at recipes/checks/<name>/health-check.yaml, and that file MUST use only the read-only assert/error operation allowlist (no script, apply, wait, command, etc.; see validators/chainsaw/allowlist.go). The contract is enforced at PR time by pkg/recipe.TestComponentRegistry_RequiresHealthCheck and validators/chainsaw.TestValidateTestReadOnly_RegistryContent; both gate make qualify. See #1223 and the chainsaw health check section in /aicr/contributor-guide/validators for the assertion patterns currently in use (DaemonSet numberReady == desiredNumberScheduled, Deployment Available=True, CRD Established=True).
- Run make bom-docs and commit docs/user/container-images.md if your change touches registry.yaml, a component’s values.yaml, or a chart version pin (see BOM regeneration).
- Unit tests. make test runs the recipe-resolution suite: pkg/recipe/yaml_test.go (static catalog: parse, refs, enum values, inheritance depth, no cycles) and pkg/recipe/metadata_test.go (runtime merge, topological sort). Both gate make qualify. If your change adds a registry entry, a new overlay file, or a mixin, the static suite typically picks it up without new test code.
- Integration validation. For a new chart pin, run make qualify and let the e2e pipeline render the bundle. KWOK simulated clusters (make kwok-e2e RECIPE=<name>) catch most resolution regressions without GPU hardware.
BOM Regeneration
docs/user/container-images.md is auto-generated from the actual
rendered Helm templates of every chart referenced by the registry. It
is regenerated by make bom-docs.
Run make bom-docs and commit the regenerated
docs/user/container-images.md in the same PR whenever you:
- Add or remove a component in recipes/registry.yaml
- Bump a chart version (in registry.yaml, an overlay, or a mixin)
- Modify a component’s values.yaml in a way that changes which images render (image repo override, subchart enable/disable, etc.)
The regen can also surface drift from upstream chart updates — when a chart bumps an image inside its own templates without a registry pin change on our side. That drift will appear in the BOM diff whether you expected it or not.
Freshness is not gated at merge time. make bom-check verifies
the committed BOM matches a fresh regen, but it is opt-in only —
not wired into make qualify, make lint, or the PR gate. Do not
rely on local qualify or CI to catch a missed regen. Wiring
bom-check into the gate is a desirable follow-up.
Common Pitfalls
- Skipping make bom-docs after a chart pin or values change. The diff doesn’t surface in qualify; the BOM goes stale silently.
- Mutating in place during merge. Overlay-derived map[string]any and []any must be deep-copied, not aliased. deepMergeMap does this for you; a bespoke helper that recurses into maps but copies []any by reference will alias and corrupt the cached source map.
- Plain yaml.Marshal on output that feeds a digest. Use serializer.MarshalYAMLDeterministic for any byte sequence a downstream consumer hashes (evidence predicate body, OCI manifest, signature input, fingerprint).
- Adding a new criteria value to the Go enum but missing call sites. A new accelerator, OS, intent, or platform value is enumerated in many files: the criteria registry, OpenAPI spec, every docs page that lists current values, issue templates, the Specificity() helper. Start from the Go type in criteria.go and follow the audit list in CLAUDE.md.
- Setting identity fields in a mixin componentRef. A mixin may not set chart, version, valuesFile, etc.; the resolver rejects with the offending field name. Move chart-changing logic to an overlay.
- Assuming the cluster fingerprint is trustworthy. The fingerprint block persisted in aicr snapshot output is advisory; trust-bearing consumers recompute via fingerprint.FromMeasurements(...) before acting. See the collector docs and ADR-007 for details.
See Also
- recipe-development.md — end-user recipe authoring guide
- /aicr/contributor-guide/components — adding a component to the registry
- /aicr/contributor-guide/validators — adding bundle-time component validation checks
- /aicr/contributor-guide/validators — adding a validator check or health check
- ADR-005 — overlay refactoring rationale (mixin composition, maximal-leaf resolver, wildcard overlays)
- ADR-007 — fingerprint, evidence bundle, verification
- pkg/recipe godoc — implementation
- api/aicr/v1/server.yaml — recipe API contract and criteria enums