Component Validation System

Learn how to define and use component validations in AICR.

Note: This document covers component validations: condition-based checks that run during bundle generation (e.g., missing config, incompatible settings). For the container-per-validator engine used by aicr validate, see the Validator Development Guide and Validator Extension Guide.

Overview

The component validation system allows components to register validation checks that run automatically during bundle generation. Validations can check for missing configuration, incompatible settings, or other conditions that might cause deployment issues.

Key Features:

  • Component-Driven: Validations are defined in the component registry (recipes/registry.yaml)
  • Condition-Based: Validations run only when specific conditions are met (e.g., intent, service)
  • Severity Levels: Each validation can be a “warning” (non-blocking) or “error” (blocking)
  • Custom Messages: Optional detail messages provide actionable guidance
  • Extensible: New validation functions can be added without modifying core bundler code

Architecture

Defining Validations

Validations are defined in the component registry (recipes/registry.yaml) under each component’s configuration.

Validation Configuration Structure

components:
  - name: my-component
    # ... other component config ...
    validations:
      - function: CheckFunctionName
        severity: warning  # or "error"
        conditions:
          intent:
            - training
            - inference
          service:
            - eks
        message: "Optional detail message explaining the issue and how to resolve it"

Validation Fields

| Field | Type | Required | Description |
|---|---|---|---|
| function | string | Yes | Name of the validation function to execute (e.g., “CheckWorkloadSelectorMissing”) |
| severity | string | Yes | Severity level: “warning” (non-blocking) or “error” (blocking) |
| conditions | map[string][]string | No | Conditions that must be met for the validation to run. Keys are criteria fields (intent, service, accelerator, os, platform). Values are arrays of strings for OR matching. |
| message | string | No | Optional detail message appended to validation results. Provides actionable guidance. |
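
The fields above could be mirrored in Go roughly as follows. This is a minimal sketch, not the AICR source: the type name ValidationConfig and its Validate method are hypothetical, and the real bundler may represent and check these fields differently.

```go
package main

import (
	"errors"
	"fmt"
)

// ValidationConfig is a hypothetical Go mirror of one entry under
// `validations:` in recipes/registry.yaml. The real type in AICR may differ.
type ValidationConfig struct {
	Function   string              // required: name of the registered check
	Severity   string              // required: "warning" or "error"
	Conditions map[string][]string // optional: criteria field -> accepted values
	Message    string              // optional: detail appended to results
}

// Validate reports whether the required fields are present and well-formed.
func (v ValidationConfig) Validate() error {
	if v.Function == "" {
		return errors.New("function is required")
	}
	switch v.Severity {
	case "warning", "error":
		return nil
	default:
		return fmt.Errorf("severity must be \"warning\" or \"error\", got %q", v.Severity)
	}
}

func main() {
	ok := ValidationConfig{Function: "CheckWorkloadSelectorMissing", Severity: "warning"}
	bad := ValidationConfig{Function: "CheckWorkloadSelectorMissing", Severity: "info"}
	fmt.Println("ok entry:", ok.Validate())
	fmt.Println("bad entry:", bad.Validate())
}
```

Rejecting an unknown severity early, as this sketch does, keeps a typo such as "warn" from silently changing whether a check blocks bundle generation.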

Conditions

Conditions use arrays of strings for OR matching. A single-element array is equivalent to a single value:

# Single value (matches only "training")
conditions:
  intent:
    - training

# Multiple values (matches "training" OR "inference")
conditions:
  intent:
    - training
    - inference

# Multiple conditions (all must match)
conditions:
  intent:
    - training
  service:
    - eks
    - gke

Supported Condition Keys:

  • intent: Workload intent (training, inference)
  • service: Kubernetes service (eks, gke, aks, oke, kind, lke)
  • accelerator: GPU type (h100, gb200, b200, a100, l40, rtx-pro-6000)
  • os: Operating system (ubuntu, rhel, cos, amazonlinux, talos)
  • platform: Platform/framework (kubeflow)

Example: Nodewright Customizations Validations

components:
  - name: nodewright-customizations
    # ... other config ...
    validations:
      # Check for missing workload-selector when training intent
      - function: CheckWorkloadSelectorMissing
        severity: warning
        conditions:
          intent:
            - training
        message: "This may cause nodewright to evict running training jobs. Consider setting --workload-selector to prevent eviction."

      # Check for missing accelerated-node-selector for training/inference
      - function: CheckAcceleratedSelectorMissing
        severity: warning
        conditions:
          intent:
            - training
            - inference
        message: "Without this selector, the customization will run on all nodes. Consider setting --accelerated-node-selector to target specific nodes."

Available Validation Functions

CheckWorkloadSelectorMissing

Checks if --workload-selector is missing when conditions are met.

Use Case: Prevent nodewright from evicting running training jobs by ensuring workload selector is configured.

Example:

validations:
  - function: CheckWorkloadSelectorMissing
    severity: warning
    conditions:
      intent:
        - training
    message: "This may cause nodewright to evict running training jobs. Consider setting --workload-selector to prevent eviction."

CheckAcceleratedSelectorMissing

Checks if --accelerated-node-selector is missing when conditions are met.

Use Case: Ensure node selectors are configured to target specific nodes rather than running on all nodes.

Example:

validations:
  - function: CheckAcceleratedSelectorMissing
    severity: warning
    conditions:
      intent:
        - training
        - inference
    message: "Without this selector, the customization will run on all nodes. Consider setting --accelerated-node-selector to target specific nodes."

CheckHostMofedWithoutNetworkOperator

Flags components requesting host-mode MOFED when the network-operator component is not in the recipe (host MOFED requires the network operator to manage the kernel modules).

Creating New Validation Functions

To add a new validation function, follow these steps:

Step 1: Implement the Validation Function

Create a new function in pkg/bundler/validations/checks.go:

// CheckMyNewValidation checks for a specific condition.
// This is a generic check that can be used by any component.
// It needs the context, fmt, and log/slog imports plus the project's
// recipe and config packages.
func CheckMyNewValidation(ctx context.Context, componentName string, recipeResult *recipe.RecipeResult, bundlerConfig *config.Config, conditions map[string][]string) ([]string, []error) {
	// Nothing to validate without a bundler config or a recipe.
	if bundlerConfig == nil || recipeResult == nil {
		return nil, nil
	}

	// Check if component exists in recipe
	hasComponent := false
	for _, ref := range recipeResult.ComponentRefs {
		if ref.Name == componentName {
			hasComponent = true
			break
		}
	}

	if !hasComponent {
		return nil, nil
	}

	// Check conditions
	if !checkConditions(recipeResult, conditions) {
		return nil, nil
	}

	// Perform your validation check.
	// Placeholder: replace with a real check against bundlerConfig,
	// for example whether some config is missing.
	someConfigMissing := false
	if someConfigMissing {
		baseMsg := fmt.Sprintf("%s is enabled but required configuration is missing", componentName)
		slog.Warn(baseMsg,
			"component", componentName,
			"conditions", conditions,
		)
		return []string{baseMsg}, nil
	}

	return nil, nil
}

Step 2: Register the Function

Register the function in the init() function in pkg/bundler/validations/checks.go so that it is auto-registered when the package loads:

// init auto-registers validation functions in this package.
func init() {
	registerCheck("CheckWorkloadSelectorMissing", CheckWorkloadSelectorMissing)
	registerCheck("CheckAcceleratedSelectorMissing", CheckAcceleratedSelectorMissing)
	registerCheck("CheckHostMofedWithoutNetworkOperator", CheckHostMofedWithoutNetworkOperator)
	registerCheck("CheckMyNewValidation", CheckMyNewValidation) // Add your new function
}

Step 3: Add to Component Registry

Add the validation to your component’s configuration in recipes/registry.yaml:

components:
  - name: my-component
    # ... other config ...
    validations:
      - function: CheckMyNewValidation
        severity: warning
        conditions:
          intent:
            - training
        message: "Optional detail message explaining the issue"

Step 4: Add Tests

Create tests in pkg/bundler/validations/checks_test.go:

func TestCheckMyNewValidation(t *testing.T) {
	tests := []struct {
		name          string
		componentName string
		recipeResult  *recipe.RecipeResult
		bundlerConfig *config.Config
		conditions    map[string][]string
		wantWarnings  int
		wantErrors    int
	}{
		// Test cases...
	}

	for _, tt := range tests {
		t.Run(tt.name, func(t *testing.T) {
			ctx := context.Background()
			// Named errs rather than errors to avoid shadowing the errors package.
			warnings, errs := CheckMyNewValidation(ctx, tt.componentName, tt.recipeResult, tt.bundlerConfig, tt.conditions)
			if len(warnings) != tt.wantWarnings {
				t.Errorf("got %d warnings, want %d", len(warnings), tt.wantWarnings)
			}
			if len(errs) != tt.wantErrors {
				t.Errorf("got %d errors, want %d", len(errs), tt.wantErrors)
			}
		})
	}
}

Validation Function Interface

All validation functions must match the ValidationFunc signature:

type ValidationFunc func(
	ctx context.Context,
	componentName string,
	recipeResult *recipe.RecipeResult,
	bundlerConfig *config.Config,
	conditions map[string][]string,
) (warnings []string, errors []error)

Parameters:

  • ctx: Context for cancellation/timeout
  • componentName: Name of the component being validated (from recipe)
  • recipeResult: The recipe result containing component refs and criteria
  • bundlerConfig: The bundler configuration (for accessing CLI flags)
  • conditions: Conditions from the validation config (map of criteria field to array of values)

Returns:

  • warnings: List of warning messages (non-blocking, displayed to user)
  • errors: List of errors (blocking, stops bundle generation)

Condition Matching Logic

The checkConditions function uses the matching logic from recipe/criteria.go for consistency:

  • Empty conditions: Validation always runs (no conditions to check)
  • Array values: OR matching - actual value must match any value in the array
  • Multiple conditions: AND matching - all conditions must match
  • Criteria matching: Uses matchesCriteriaField for consistent field matching

Example:

conditions:
  intent:
    - training
    - inference
  service:
    - eks

This validation runs when:

  • Intent is “training” OR “inference” AND
  • Service is “eks”
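
The AND/OR rules above can be sketched in a few lines of Go. This is an illustration under stated assumptions, not the real checkConditions: matchesConditions is a hypothetical name, criteria are modeled as a plain map of field to value, and values are compared by exact string equality, whereas the actual matchesCriteriaField logic in recipe/criteria.go may treat some values differently.

```go
package main

import "fmt"

// matchesConditions (hypothetical) returns true when every condition key
// matches (AND) and, for each key, the actual value equals at least one of
// the listed values (OR). Empty or nil conditions always match.
func matchesConditions(criteria map[string]string, conditions map[string][]string) bool {
	for key, allowed := range conditions {
		actual, ok := criteria[key]
		if !ok {
			return false // the recipe does not set this criteria field
		}
		matched := false
		for _, want := range allowed {
			if want == actual {
				matched = true
				break
			}
		}
		if !matched {
			return false // none of the OR alternatives matched
		}
	}
	return true
}

func main() {
	conditions := map[string][]string{
		"intent":  {"training", "inference"},
		"service": {"eks"},
	}
	onEKS := map[string]string{"intent": "inference", "service": "eks"}
	onGKE := map[string]string{"intent": "training", "service": "gke"}

	fmt.Println("inference on eks:", matchesConditions(onEKS, conditions))
	fmt.Println("training on gke:", matchesConditions(onGKE, conditions))
	fmt.Println("no conditions:", matchesConditions(onGKE, nil))
}
```

In this sketch a key with an empty value list never matches, because there is no alternative for the actual value to equal.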

Severity Levels

Warning (Non-Blocking)

Warnings are displayed to the user but do not stop bundle generation. Use for missing optional configuration, best-practice recommendations, potential performance issues, or informational messages.

severity: warning

Output:

Note:
⚠ Warning: Component is enabled but optional configuration is missing.

Error (Blocking)

Errors stop bundle generation and return an error. Use for missing required configuration, incompatible settings, critical deployment issues, or security concerns.

severity: error

Output:

Error: Component validation failed: required configuration is missing
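
The difference between the two severity levels can be sketched as a small aggregation step. This is a hedged illustration, not the bundler's code: the finding type and the evaluate and withDetail functions are hypothetical, and joining the base text and the optional message with a space is an assumption about how the detail message is appended.

```go
package main

import (
	"errors"
	"fmt"
)

// finding is a hypothetical record of one validation result together with
// the severity and optional message configured in registry.yaml.
type finding struct {
	Severity string // "warning" or "error"
	Text     string // base message produced by the validation function
	Detail   string // optional `message:` from the validation config
}

// withDetail appends the optional detail message to the base text.
func withDetail(base, detail string) string {
	if detail == "" {
		return base
	}
	return base + " " + detail
}

// evaluate collects warnings (non-blocking) and joins errors (blocking).
// A non-nil error means bundle generation should stop.
func evaluate(findings []finding) ([]string, error) {
	var warnings []string
	var errs []error
	for _, f := range findings {
		text := withDetail(f.Text, f.Detail)
		switch f.Severity {
		case "warning":
			warnings = append(warnings, text)
		case "error":
			errs = append(errs, errors.New(text))
		default:
			errs = append(errs, fmt.Errorf("unknown severity %q for finding: %s", f.Severity, text))
		}
	}
	// errors.Join returns nil when errs contains no errors (Go 1.20+).
	return warnings, errors.Join(errs...)
}

func main() {
	warnings, err := evaluate([]finding{
		{Severity: "warning", Text: "optional configuration is missing", Detail: "Consider setting it."},
		{Severity: "error", Text: "required configuration is missing"},
	})
	for _, w := range warnings {
		fmt.Println("Warning:", w)
	}
	if err != nil {
		fmt.Println("Error: Component validation failed:", err)
	}
}
```

Collecting every finding before returning, rather than stopping at the first error, lets the user see all blocking problems in one run.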

Best Practices

  1. Be Specific: Provide clear, actionable messages that explain what’s wrong and how to fix it
  2. Use Conditions Wisely: Only run validations when they’re relevant to avoid noise
  3. Prefer Warnings: Use warnings for non-critical issues; reserve errors for blocking problems
  4. Reuse Existing Checks: Use generic checks like CheckWorkloadSelectorMissing when possible
  5. Test Thoroughly: Add comprehensive tests covering all condition combinations
  6. Document Usage: Include examples in component documentation

Examples

Example 1: Simple Condition Check

validations:
  - function: CheckWorkloadSelectorMissing
    severity: warning
    conditions:
      intent:
        - training
    message: "Training workloads should specify a workload selector to prevent eviction."

Example 2: Multiple Conditions

validations:
  - function: CheckAcceleratedSelectorMissing
    severity: warning
    conditions:
      intent:
        - training
        - inference
      service:
        - eks
        - gke
    message: "EKS and GKE deployments should specify accelerated node selectors for GPU workloads."

Example 3: Error Severity

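# Illustrative only: CheckRequiredConfig is not among the functions listed
# above and would need to be implemented and registered first.
validations:
  - function: CheckRequiredConfig
    severity: error
    conditions:
      intent:
        - training
    message: "Training workloads require this configuration. Bundle generation cannot continue."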

Troubleshooting

Validation not running:

  • Check that the component is in the recipe’s componentRefs
  • Verify conditions match the recipe’s criteria
  • Check that the validation function is registered in checks.go init()

Warning not displayed:

  • Verify the validation function returns a non-empty warnings slice
  • Check that severity is “warning” (not “error”)
  • Ensure the bundler is collecting warnings correctly

Error stopping bundle:

  • Check that severity is “error” (not “warning”)
  • Verify the validation function returns errors
  • Review the error message for details