CLI Architecture
The aicr CLI provides command-line access to AICR configuration management capabilities.
Overview
The CLI provides a four-step workflow for optimizing GPU infrastructure, plus a query command for inspecting hydrated recipe values:
Step 1: Snapshot Command
Captures system configuration:
- Operating system: grub, kmod, sysctl, /etc/os-release
- SystemD services: containerd, docker, kubelet (service state and configuration)
- Kubernetes: API server version, container images, ClusterPolicy custom resource
- GPU hardware: driver version, CUDA libraries, MIG configuration, device properties
- Node topology (cluster-wide taints and labels)
Output destinations:
- File: `--output system.yaml` (local filesystem)
- Stdout: Default (can be piped to other commands)
- ConfigMap: `--output cm://namespace/name` (Kubernetes ConfigMap using Kubernetes API)
Agent deployment:
Kubernetes Job runs on GPU nodes. Writes snapshot to ConfigMap via Kubernetes API. Requires ServiceAccount with ConfigMap create/update permissions (Role in target namespace). Does not require PersistentVolume.
Step 2: Recipe Command
Generates optimized configuration recipes with two modes:
- Query Mode: Direct recipe generation from system parameters (OS, GPU, K8s, etc.)
- Snapshot Mode: Analyzes captured snapshots and generates tailored recipes based on workload intent (training/inference)
Input Options:
- Query parameters: `--os ubuntu --gpu gb200 --service eks` (direct recipe generation)
- Snapshot file: `--snapshot system.yaml` (analyze captured snapshot)
- ConfigMap: `--snapshot cm://namespace/name` (read from Kubernetes)
Output Options:
- File: `--output recipe.yaml` (write to file)
- Stdout: Default behavior (pipe to bundle command)
- ConfigMap: `--output cm://namespace/name` (store in Kubernetes)
Step 3: Validate Command
Validates recipe constraints against actual system measurements from a snapshot.
Input sources:
- Recipe file: `--recipe recipe.yaml` (local filesystem)
- Recipe URL: `--recipe https://example.com/recipe.yaml` (HTTP/HTTPS)
- Recipe ConfigMap: `--recipe cm://namespace/name` (Kubernetes ConfigMap)
- Snapshot file: `--snapshot snapshot.yaml` (local filesystem)
- Snapshot ConfigMap: `--snapshot cm://namespace/name` (Kubernetes ConfigMap)
Constraint format:
Constraints use fully qualified measurement paths: `{Type}.{Subtype}.{Key}`
- `K8s.server.version` - Kubernetes server version
- `OS.release.ID` - Operating system identifier
- `OS.release.VERSION_ID` - OS version
- `OS.sysctl./proc/sys/kernel/osrelease` - Kernel version
Supported operators:
- `>= 1.30` - Greater than or equal (version comparison)
- `<= 1.33` - Less than or equal (version comparison)
- `> 1.30`, `< 2.0` - Strict comparison
- `== ubuntu`, `!= rhel` - Equality operators
- `ubuntu` - Exact string match (no operator)
Output:
- Validation result with summary (passed/failed/skipped counts)
- Individual constraint results with expected vs actual values
- Status: `pass`, `fail`, or `partial` (some skipped)
CI/CD integration:
By default, the command exits with non-zero status when constraints fail (ideal for CI/CD). To run in informational mode without failing:
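A sketch of both behaviors; the informational-mode flag name is not confirmed by this document, so the shell-level fallback is shown:

```bash
# Default: exit non-zero when any constraint fails (suitable as a CI/CD gate)
aicr validate --recipe recipe.yaml --snapshot snapshot.yaml

# Informational mode: report results without failing the pipeline.
# If the CLI exposes a dedicated flag, prefer it; otherwise mask the exit code:
aicr validate --recipe recipe.yaml --snapshot snapshot.yaml || true
```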
Step 4: Bundle Command
Generates deployment artifacts from recipes:
- Helm values files (values.yaml)
- Kubernetes manifests (ClusterPolicy, NICClusterPolicy, etc.)
- SHA256 checksum file
- README documentation: root `bundle/README.md` is generated by the deployer; per-component `bundle/<component>/README.md` is generated by each component bundler
Input sources:
- Recipe file: `--recipe recipe.yaml` (local filesystem)
- ConfigMap: `--recipe cm://namespace/name` (Kubernetes ConfigMap)
Output: Local directory only. ConfigMap output is not supported for bundles.
Current bundlers:
- GPU Operator: Generates GPU Operator Helm values and ClusterPolicy manifest
- Network Operator: Generates Network Operator Helm values and NICClusterPolicy manifest
- Cert-Manager: Generates cert-manager Helm values for certificate management
- NVSentinel: Generates NVSentinel Helm values
- Nodewright: Generates Nodewright Operator Helm values and Nodewright CR manifest for node optimization
Value overrides:
The `--set` flag allows runtime customization of generated bundle values:
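For example (a sketch; the value paths shown are illustrative and depend on the recipe's hydrated values):

```bash
# Override generated bundle values at bundle time with repeatable --set flags
aicr bundle --recipe recipe.yaml \
  --set gpu-operator.driver.enabled=true \
  --set gpu-operator.mig.strategy=single
```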
Node scheduling options:
The bundle command supports node selector and toleration flags for controlling workload placement:
Flags:
- `--system-node-selector key=value` – Node selector for system components (repeatable)
- `--system-node-toleration key=value:effect` – Toleration for system components (repeatable)
- `--accelerated-node-selector key=value` – Node selector for GPU nodes (repeatable)
- `--accelerated-node-toleration key=value:effect` – Toleration for GPU nodes (repeatable)
- `--nodes N` – Estimated number of GPU nodes (bundle-time only; written to paths in registry under `nodeScheduling.nodeCountPaths`)
These flags apply selectors/tolerations to bundler-specific paths (e.g., GPU Operator uses `operator.nodeSelector` and `daemonsets.nodeSelector`). The `--nodes` value is applied to paths listed in the registry under `nodeScheduling.nodeCountPaths`.
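A sketch using the flags above (the label keys and values are illustrative):

```bash
aicr bundle --recipe recipe.yaml \
  --system-node-selector node-role.kubernetes.io/control-plane=true \
  --system-node-toleration node-role.kubernetes.io/control-plane=true:NoSchedule \
  --accelerated-node-selector nvidia.com/gpu.present=true \
  --accelerated-node-toleration nvidia.com/gpu=present:NoSchedule \
  --nodes 16
```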
Execution model:
- Bundlers run concurrently (parallel execution)
- All components from the recipe are bundled automatically
- Errors from any bundler cause immediate cancellation via context propagation
Testing: End-to-end workflow validated by Chainsaw tests in tests/chainsaw/cli/
Architecture Diagram
ConfigMap Integration
The CLI supports Kubernetes-native ConfigMap storage using the `cm://namespace/name` URI scheme:
Benefits:
- No file dependencies - Direct Kubernetes API integration
- Agent-friendly - Jobs can write snapshots without volumes
- Pipeline integration - CI/CD can read/write ConfigMaps
- Multi-cluster - Share snapshots/recipes across clusters
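A sketch of a `cm://` round trip, assuming the namespace and ConfigMap names shown:

```bash
# Capture a snapshot straight into a ConfigMap, generate a recipe from it,
# and store the recipe back in the cluster for other pipelines to consume
aicr snapshot --output cm://gpu-audit/node-snapshot
aicr recipe --snapshot cm://gpu-audit/node-snapshot --intent training \
  --output cm://gpu-audit/node-recipe
```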
RBAC Requirements:
- ConfigMap read/write permissions in target namespace
- ServiceAccount with appropriate Role/RoleBinding
- See Agent Deployment for details
Component Details
Entry Point: cmd/aicr/main.go
Minimal entry point that delegates to the CLI package:
Root Command: pkg/cli/root.go
Responsibilities:
- Command registration and routing
- Version information injection (via ldflags)
- Global flag handling (debug mode, log formatting)
- Logging mode selection and initialization
Key Features:
- Version info: `version`, `commit`, `date` (overridden at build time)
- Three logging modes:
  - CLI Mode (default): Minimal output for users (`SetDefaultCLILogger`)
  - Text Mode (`--debug`): Full metadata for debugging (`SetDefaultLoggerWithLevel`)
  - JSON Mode (`--log-json`): Structured logs for automation (`SetDefaultStructuredLoggerWithLevel`)
- Logger selection logic:
- Shell completion support
- Command listing for auto-completion
Snapshot Command: pkg/cli/snapshot.go
Captures comprehensive system configuration snapshots.
Command Flow
Detailed Data Flow
Usage Examples
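Representative invocations, a sketch using only the flags documented above:

```bash
# Default: snapshot to stdout, pipe-friendly
aicr snapshot

# Write to a local file
aicr snapshot --output system.yaml

# Write directly to a Kubernetes ConfigMap (no volumes needed)
aicr snapshot --output cm://default/system-snapshot

# Debug logging with full metadata
aicr snapshot --debug --output system.yaml
```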
Agent Deployment Pattern
The snapshot command can be deployed as a Kubernetes Job for automated cluster auditing:
Deployment:
RBAC Requirements:
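Since `pkg/k8s/agent` creates these resources programmatically, the manifest below is only an illustration of the shape of the generated Job and RBAC; the names, namespace, and image are hypothetical:

```bash
kubectl apply -f - <<'EOF'
apiVersion: v1
kind: ServiceAccount
metadata:
  name: aicr-snapshot
  namespace: gpu-audit
---
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: aicr-snapshot
  namespace: gpu-audit
rules:
  - apiGroups: [""]
    resources: ["configmaps"]
    verbs: ["get", "create", "update"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: aicr-snapshot
  namespace: gpu-audit   # must match the namespace the Job writes to
subjects:
  - kind: ServiceAccount
    name: aicr-snapshot
    namespace: gpu-audit
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: Role
  name: aicr-snapshot
---
apiVersion: batch/v1
kind: Job
metadata:
  name: aicr-snapshot
  namespace: gpu-audit
spec:
  template:
    spec:
      serviceAccountName: aicr-snapshot
      restartPolicy: Never
      containers:
        - name: snapshot
          image: example.com/aicr:latest   # hypothetical image reference
          command: ["aicr", "snapshot", "--output", "cm://gpu-audit/node-snapshot"]
EOF
```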
Key Points:
- No volumes needed - writes directly via Kubernetes API
- RBAC RoleBinding must reference correct namespace
- ConfigMap automatically created if it doesn’t exist
- Supports update pattern (overwrite existing snapshots)
- RBAC and Job resources are created programmatically by `pkg/k8s/agent`
Recipe Command: pkg/cli/recipe.go
Generates optimized configuration recipes based on environment parameters.
Command Flow
Detailed Data Flow
Recipe Matching Algorithm
The recipe matching uses an asymmetric rule-based query system where overlay criteria (rules) match against user queries (candidates):
Asymmetric Matching Rules:
- All non-empty fields in the overlay criteria must be satisfied by the query
- Empty overlay field → Wildcard (matches any query value)
- Query “any” field → Only matches overlay “any” (does NOT match specific overlays)
- Version fields use semantic version equality with precision awareness
This asymmetric behavior ensures generic queries (e.g., `--service eks --intent training`) don’t match overly specific recipes (e.g., recipes requiring `accelerator: gb200`).
Usage Examples
Recipe Command Modes
The recipe command supports two modes of operation:
Query Mode (Default)
Direct recipe generation from environment parameters:
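For example, using the query parameters documented above:

```bash
# Generate a recipe directly from environment parameters (no snapshot needed)
aicr recipe --os ubuntu --gpu gb200 --service eks --intent training --output recipe.yaml
```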
Snapshot Mode
Analyze captured snapshots and generate tailored recipes:
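For example:

```bash
# Extract the environment from a captured snapshot and tailor the recipe to the intent
aicr recipe --snapshot system.yaml --intent inference --output recipe.yaml
```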
Query Extraction from Snapshot
When using snapshot mode, the recipe builder extracts environment parameters from the snapshot:
From OS Measurements:
- release subtype → OS family (ubuntu, rhel, cos, amazonlinux, talos)
From Kubernetes Measurements:
- server subtype → K8s service provider (eks, gke, aks) inferred from images
From GPU Measurements:
- Product Name → GPU type detection (H100, GB200, B200, A100, L40, RTX PRO 6000)
- Maps product names to normalized accelerator types for recipe matching
Intent Types:
- training – Optimize for high throughput, batch processing, multi-GPU orchestration
- inference – Optimize for low latency, single-request performance, efficient batching
- any – Provides general-purpose recommendations applicable to both workloads
External Data Directory
The `--data` flag enables extending embedded recipe data with external files:
Requirements:
- External directory must contain `registry.yaml`
- No symlinks allowed (security)
- Max file size: 10MB per file
Merge Rules:
- `registry.yaml`: Components merged by name (external overrides embedded)
- All other files: External replaces embedded if path matches
Usage Examples
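A sketch, assuming `./custom-recipes` contains a `registry.yaml`:

```bash
# Merge external recipe data over the embedded defaults
aicr recipe --data ./custom-recipes --os ubuntu --gpu gb200 --output recipe.yaml
```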
Recipe Output Structure
Error Handling
- Query Mode:
  - Invalid parameter values: Returns error with supported options
  - Missing required parameters: Allows “any” as default fallback
  - No matching overlays: Returns recipe with base configuration
- Snapshot Mode:
  - Missing snapshot file: File not found error with path
  - Invalid snapshot format: Parse error with details
  - Invalid intent: Returns error with supported intent types (training, inference, any)
  - Extraction failures: Best-effort extraction with partial criteria
Common Errors:
- Unknown output format: Error with supported formats list (json, yaml)
Query Command: pkg/cli/query.go
Extracts specific values from the fully hydrated recipe configuration using dot-path selectors.
Command Flow
Hydration Process
The query command builds a fully hydrated `map[string]any` from the RecipeResult:
- Recipe-level fields (criteria, metadata, deploymentOrder, constraints) are mapped directly
- Each `ComponentRef` is expanded into a component map with metadata fields (name, chart, source, version, etc.)
- `GetValuesForComponent` is called per component to merge base values, overlay values, and inline overrides
- The merged values are inlined under each component’s `values` key
Selector Resolution
The selector uses dot-delimited path walking. Leading dots are stripped (yq-style), so `.components.X` and `components.X` are equivalent. An empty selector or `.` returns the entire hydrated map.
Usage Examples
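Illustrative invocations; the component name is hypothetical, and whether the selector is positional or flag-based should be confirmed with `aicr query --help`:

```bash
# Entire hydrated map (an empty selector or "." is equivalent)
aicr query --recipe recipe.yaml .

# Leading dots are optional (yq-style); these two selectors are equivalent
aicr query --recipe recipe.yaml .components.gpu-operator.values
aicr query --recipe recipe.yaml components.gpu-operator.values
```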
Implementation: pkg/recipe/query.go (HydrateResult, Select)
Bundle Command: pkg/cli/bundle.go
Generates deployment-ready bundles (Helm values, Kubernetes manifests, installation scripts) from recipes.
Command Flow
Detailed Data Flow
Bundler Data Flow
Simplified Architecture (RecipeResult-to-Template):
- Key Simplification: Single RecipeResult path (no dual Recipe/RecipeResult routing)
- Data Flow: RecipeResult → Values Map + ScriptData → Templates
- Templates: Use `index .Values "key"` for config, `.Script.*` for metadata
Bundler Architecture
BaseBundler Helper Pattern
RecipeResult-Based Data Access
Data Flow: RecipeResult → Values/ScriptData → Template
Registry Pattern
DefaultBundler Options:
- `WithBundlerTypes([]BundleType)` – Specify bundler types (empty = all registered)
- `WithFailFast(bool)` – Stop on first error (default: false/collect all)
- `WithConfig(*Config)` – Provide bundler configuration
- `WithRegistry(*Registry)` – Use custom bundler registry
Execution:
- Parallel execution by default: Uses `errgroup.WithContext` for concurrent execution
  - All bundlers run concurrently when no types specified
  - Faster for multiple bundlers
  - Context cancellation propagates to all bundlers
- Bundlers are stateless (thread-safe by design)
- BaseBundler provides thread-safe operations
Architecture Benefits:
- 75% less code per bundler (BaseBundler eliminates boilerplate)
- 34% less test code (TestHarness standardizes testing)
- 15+ internal helpers for recipe parsing
- Automatic registration via init() functions
- Fail-fast on duplicate bundler types
Usage Examples
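Representative invocations; the output-directory flag is assumed to follow the same `--output` convention as the other commands:

```bash
# Bundle all components from a local recipe into a local directory
aicr bundle --recipe recipe.yaml --output ./bundle

# Read the recipe from a ConfigMap; bundle output itself is local-only
aicr bundle --recipe cm://gpu-audit/node-recipe --output ./bundle
```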
Bundle Output Structure
Error Handling
Validation Errors:
- Missing recipe file: File not found error with path
- Invalid recipe format: Parse error with details
- Invalid bundler type: Error with list of supported types
- Empty measurements: Recipe validation failure
Execution Errors:
- FailFast=false (default): Collects all errors, continues execution
  - Returns partial results with error list
  - Exit code indicates failure count
- FailFast=true: Stops on first bundler error
  - Returns immediately with error
  - Subsequent bundlers not executed
Common Error Scenarios:
CLI Integration
The bundle command integrates with the CLI through:
- Shared Serializer: Uses same `serializer.FromFile` for recipe loading
- Structured Logging: Consistent `slog` structured logging
- Context Propagation: Respects context cancellation
- Error Patterns: Uses same error handling conventions
Log Output Example:
Common Errors:
Shared Infrastructure
Collector Factory Pattern
The CLI uses the Factory Pattern for collector instantiation, enabling:
- Testability: Inject mock collectors for unit tests
- Flexibility: Easy to add new collector types
- Encapsulation: Hide collector creation complexity
Serializer Abstraction
Output formatting is abstracted through the serializer.Serializer interface:
Implementations:
- JSON: `encoding/json` with 2-space indent
- YAML: `gopkg.in/yaml.v3`
- Table: `text/tabwriter` for columnar display
Measurement Data Model
All collected data uses a unified measurement.Measurement structure:
Error Handling
CLI Error Strategy
- Flag Validation: User-friendly error messages for invalid flags
- Version Parsing: Specific error types (ErrNegativeComponent, etc.)
- Collector Failures: Log errors, continue with partial data where possible
- Serialization Errors: Fatal - abort and report
- Exit Codes: Non-zero exit code on any failure
Example Error Messages
Performance Characteristics
Snapshot Command
- Parallel Collection: All collectors run concurrently via `errgroup`
- Memory Usage: ~10-50MB for typical workloads
- Scalability: O(n) with number of pods/nodes for K8s collector
Recipe Command
- Store Loading: Once per process (cached via `sync.Once`)
- Memory Usage: ~5-10MB (embedded YAML + parsed structure)
- Scalability: O(m) with number of overlays (typically <100)
Build Configuration
Version Injection via ldflags
Build-time version information injection:
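A sketch; the `-X` variable paths must match the package that actually declares `version`, `commit`, and `date`:

```bash
go build -o aicr -ldflags "\
  -X main.version=$(git describe --tags --always) \
  -X main.commit=$(git rev-parse --short HEAD) \
  -X main.date=$(date -u +%Y-%m-%dT%H:%M:%SZ)" \
  ./cmd/aicr
```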
Testing Strategy
Unit Tests
- Flag parsing and validation
- Version parsing and error handling
- Query building from command flags
- Serializer format selection
Integration Tests
- Mock collectors for deterministic output
- Full command execution with fake factory
- Output format validation
Example Test Structure
Dependencies
External Libraries
- `github.com/urfave/cli/v3` - CLI framework
- `golang.org/x/sync/errgroup` - Concurrent error handling
- `gopkg.in/yaml.v3` - YAML parsing
- `log/slog` - Structured logging
Internal Packages
- `pkg/collector` - System data collection
- `pkg/measurement` - Data model
- `pkg/recipe` - Recipe building
- `pkg/version` - Semantic versioning
- `pkg/serializer` - Output formatting
- `pkg/logging` - Logging configuration
- `pkg/snapshotter` - Snapshot orchestration
Future Enhancements
Short-Term (< 3 months)
- Caching Layer
  - Rationale: Reduce latency for repeated `aicr snapshot` calls in scripts
  - Implementation: `sync.Map` with TTL-based eviction using `time.AfterFunc`
  - Trade-off: Stale data risk vs 5-10x performance improvement
  - Reference: sync.Map
- Differential Snapshots
  - Use Case: CI/CD pipelines detecting configuration drift
  - Implementation: `github.com/google/go-cmp/cmp` for deep comparison
  - Output: JSON Patch (RFC 6902) format for machine consumption
  - CLI: `aicr diff baseline.yaml current.yaml --format patch`
- Measurement Filtering
  - Use Case: Extract only GPU data without K8s overhead
  - CLI: `aicr snapshot --filter gpu,os --exclude k8s`
  - Implementation: Post-collection filtering before serialization
  - Performance: Saves 60-70% execution time when K8s excluded
- Batch Mode
  - Use Case: Fleet-wide configuration auditing (100s of nodes)
  - Implementation: Worker pool with `errgroup.SetLimit()`
  - CLI: `aicr snapshot --nodes nodes.txt --workers 10 --output results/`
  - Reference: errgroup Limits
Mid-Term (3-6 months)
- Plugin System
  - Rationale: Custom collectors without forking codebase
  - Interface: `type Collector interface { Collect(context.Context) (Measurement, error) }`
  - Options: Go plugins (unstable across versions) or WASM (safe, portable)
  - Security: Sandboxed execution with restricted syscalls
  - Reference: WebAssembly System Interface
- Configuration Files
  - Use Case: Avoid repeating `--os`, `--gpu` flags
  - Format: YAML following XDG Base Directory spec
  - Location: `~/.config/aicr/config.yaml` (Linux/macOS), `%APPDATA%\aicr\config.yaml` (Windows)
  - Example: see the sketch after this list
- Watch Mode
  - Implementation: Hybrid of `fsnotify` + periodic polling
  - CLI: `aicr snapshot --watch --interval 30s --on-change ./alert.sh`
  - Output: Stream of JSON diffs to stdout
  - Use Case: Real-time monitoring with alerting
- Schema Validation
  - Use Case: Ensure snapshots conform to API version spec
  - Implementation: Embed JSON Schema in binary with `go:embed`
  - Library: `github.com/santhosh-tekuri/jsonschema/v5` (fastest Go validator)
  - CLI: `aicr validate --schema v1 snapshot.json`
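A minimal sketch of the proposed configuration file from the Configuration Files item above; since this is a future enhancement, the key names are hypothetical, mirroring the CLI flag names:

```bash
mkdir -p ~/.config/aicr
cat > ~/.config/aicr/config.yaml <<'EOF'
# Hypothetical defaults; keys mirror the --os, --gpu, and --service flags
os: ubuntu
gpu: gb200
service: eks
EOF
```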
Long-Term (6-12 months)
- gRPC Mode
  - Rationale: Better streaming, 3-5x smaller payloads than JSON
  - Implementation: Bi-directional streaming with protobuf
  - Trade-off: Added complexity (proto definitions) vs performance gains
  - Reference: gRPC Go
- Distributed Tracing
  - Use Case: Debug performance issues across collectors
  - Implementation: OpenTelemetry SDK with span per collector
  - Exporter: OTLP to Jaeger/Tempo
  - CLI: `aicr snapshot --trace --trace-endpoint localhost:4317`
  - Reference: OpenTelemetry Go
- Policy Enforcement
  - Use Case: Block non-compliant configs in CI/CD
  - Implementation: Embed OPA (`github.com/open-policy-agent/opa`)
  - CLI: `aicr validate --policy policy.rego snapshot.yaml`
  - Exit Code: 0 = pass, 1 = policy violations
  - Reference: OPA Go Integration
- Cloud Storage Integration
  - Use Case: Centralized storage for fleet management
  - CLI: `aicr snapshot --upload s3://bucket/snapshots/$(hostname).yaml`
  - Implementation: AWS SDK v2 with resumable uploads
  - Authentication: IAM roles, service accounts, credential chain
  - Reference: AWS SDK for Go V2
Production Deployment Patterns
Pattern 1: CI/CD Integration
Use Case: Automated configuration validation in build pipelines
GitLab CI Example:
GitHub Actions Example:
Jenkins Pipeline:
Pattern 2: Scheduled Auditing
Use Case: Nightly configuration drift detection across fleet
Kubernetes CronJob:
Systemd Timer (Bare Metal):
Enable with:
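Assuming the unit files were installed as `aicr-snapshot.service` and `aicr-snapshot.timer` (names illustrative):

```bash
sudo systemctl daemon-reload
sudo systemctl enable --now aicr-snapshot.timer
systemctl list-timers aicr-snapshot.timer
```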
Pattern 3: Fleet Management
Use Case: Collect snapshots from 100s of GPU nodes in parallel
Ansible Playbook:
Terraform Provisioning:
Pattern 4: Real-Time Monitoring
Use Case: Continuous configuration monitoring with Prometheus alerting
Prometheus Exporter (future enhancement):
Prometheus Alerting Rules:
Advanced Usage Patterns
Snapshot Diffing with jq
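A sketch, assuming two JSON-formatted snapshots (the serializer supports JSON output):

```bash
# Sort keys with jq so the diff shows real drift rather than ordering noise
diff <(jq -S . baseline.json) <(jq -S . current.json)
```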
Recipe Generation Pipeline
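Since snapshot and recipe default to stdout, the steps compose into a pipeline; the file-based form below avoids assuming a stdin convention for the downstream commands:

```bash
aicr snapshot --output system.yaml
aicr recipe --snapshot system.yaml --intent training --output recipe.yaml
aicr bundle --recipe recipe.yaml --output ./bundle
```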
Automated Remediation
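A sketch that keys off the validate command's documented non-zero exit on failed constraints:

```bash
aicr snapshot --output current.yaml
if ! aicr validate --recipe recipe.yaml --snapshot current.yaml; then
  # Constraints failed: regenerate the bundle and reapply
  # (the apply step is environment-specific, e.g. helm upgrade with the generated values)
  aicr bundle --recipe recipe.yaml --output ./bundle
fi
```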
Troubleshooting Guide
Issue: “nvidia-smi not found”
Symptoms: GPU measurements empty, error in logs
Root Cause: NVIDIA driver not installed or not in PATH
Diagnosis:
Resolution:
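Typical checks; the driver package name is distribution-specific and illustrative:

```bash
# Diagnosis: is the driver installed and on PATH?
command -v nvidia-smi || echo "nvidia-smi not on PATH"
nvidia-smi --query-gpu=driver_version --format=csv,noheader

# Resolution (Ubuntu example; the package name varies by release)
sudo apt-get update && sudo apt-get install -y nvidia-driver-570
```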
Issue: “Kubernetes API server unreachable”
Symptoms: K8s measurements empty, “connection refused” error
Root Cause: Not running in cluster, or kubeconfig missing/invalid
Diagnosis:
Resolution:
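Typical checks:

```bash
# Diagnosis: can the current context reach the API server?
kubectl cluster-info
echo "KUBECONFIG=${KUBECONFIG:-~/.kube/config (default)}"

# Resolution: point at a valid kubeconfig, or run in-cluster with a ServiceAccount
export KUBECONFIG=/path/to/kubeconfig
aicr snapshot --output system.yaml
```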
Issue: “Snapshot too slow (> 5s)”
Symptoms: Long execution time, timeouts in CI/CD
Root Cause: Large cluster (1000s of pods), slow API server, many GPUs
Diagnosis:
Resolution:
Issue: “Out of memory during snapshot”
Symptoms: Process killed, OOMKilled in K8s, segfault
Root Cause: Large measurement data (10k+ pods, many images)
Diagnosis:
Resolution:
Performance Tuning
CPU Profiling
Memory Profiling
Benchmarking
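The standard Go tooling covers all three activities above; the package path is illustrative:

```bash
# Collect CPU and memory profiles while benchmarking a single package
go test -bench=. -benchmem -cpuprofile cpu.out -memprofile mem.out ./pkg/collector
go tool pprof -top cpu.out
go tool pprof -top mem.out
```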
Optimization Recommendations
- Reduce String Allocations
  - Current: `fmt.Sprintf("%s:%s", name, tag)` allocates
  - Optimized: Use `strings.Builder` for concatenation
  - Savings: 20-30% fewer allocations in image collector
- Preallocate Slices
  - Current: `measurements := []Measurement{}`
  - Optimized: `measurements := make([]Measurement, 0, expectedSize)`
  - Benefit: Avoids slice growth reallocations
  - When: Size predictable (e.g., GPU count known)
- Pool Large Objects
  - Use Case: Measurement structs allocated repeatedly
  - Implementation: `sync.Pool`
  - Reference: sync.Pool
- Avoid Reflection
  - Current: `encoding/json` uses reflection
  - Optimized: Code-generated marshaling with `easyjson`
  - Benefit: 2-3x faster JSON serialization
  - Trade-off: Build complexity vs performance
  - Reference: easyjson
- Batch API Operations
  - Current: Multiple API calls per collector
  - Optimized: Aggregate calls where possible
  - Example: List all pods once, filter in memory
  - Benefit: Reduces API server load, faster execution
- Concurrent Collectors
  - Current: `errgroup` with limit
  - Tuning: Adjust limit based on collector type
  - Reference: errgroup SetLimit
Security Best Practices
Running as Non-Root
CLI:
Kubernetes Job:
Secrets Management
Input Validation
CLI validates all inputs before processing:
Network Security
Bundler Framework: Components and Extension
The bundler framework documented under Bundle Command defines how individual components are turned into deployment artifacts. This section drills into the architecture diagrams, a worked example (GPU Operator), observability surfaces, the add-a-component workflow, and conventions for new bundlers. For command flow, flags, and usage examples, see the Bundle Command section above.
Component Diagram
The Generate README node here is the per-component bundle/<component>/README.md. The root bundle/README.md is generated by the deployer (see Deployer Framework below).
Sequence Diagram
Worked Example: GPU Operator Bundler
The GPU Operator bundler generates a complete deployment bundle for NVIDIA GPU Operator, extracting configuration from recipe measurements.
Recipe Data Extraction
K8s Measurements (measurement.TypeK8s):
- Image Subtype — Component versions:
- Config Subtype — Boolean flags:
GPU Measurements (measurement.TypeGPU):
Template Files
values.yaml.tmpl — Helm chart values:
install.sh.tmpl — Installation script:
Observability
Metrics
Prometheus metrics exposed by the bundler framework:
Structured Logging
slog integration for structured log output:
Adding New Components
Adding a new component requires no Go code. Components are configured declaratively:
- Add to Component Registry (`recipes/registry.yaml`):
- Create Values File (`recipes/components/my-operator/values.yaml`):
- Add to Recipe Overlay (`recipes/overlays/<overlay>.yaml`):
- Test the Component (see the sketch after this list):
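A sketch of the verification step, reusing the `my-operator` name from the registry example:

```bash
# Generate a bundle and confirm the new component's artifacts appear
aicr recipe --os ubuntu --gpu gb200 --output recipe.yaml
aicr bundle --recipe recipe.yaml --output ./bundle
ls ./bundle/my-operator/   # expect values.yaml and a generated README.md
```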
See Bundler Development Guide for detailed documentation.
Best Practices
Template Design:
- Keep templates simple and focused
- Use descriptive variable names
- Add comments for complex logic
- Validate template rendering in tests
- Don’t put business logic in templates
Error Handling:
- Use structured errors with context (`pkg/errors`)
- Wrap errors with meaningful messages
- Validate early (before starting generation)
- Clean up resources on error
- Don’t swallow errors silently
Testing:
- Test with realistic recipe data
- Use table-driven tests for coverage
- Test error paths explicitly
- Verify generated file content
- Don’t skip integration tests
Performance:
- Use parallel generation for multiple files
- Stream large files instead of buffering
- Reuse template instances when possible
- Profile bundle generation for bottlenecks
- Don’t generate synchronously without reason
Deployer Framework: GitOps Integration
The bundle command integrates with GitOps tools through the Deployer Framework, which generates deployment-specific artifacts alongside the standard bundle files.
Overview
Purpose: Generate GitOps-ready deployment artifacts that integrate with popular continuous delivery tools.
Supported Deployers:
Key Feature: Deployment Order
All deployers respect the deploymentOrder field from the recipe, ensuring components are installed in the correct sequence:
Deployer Architecture
Argo CD Deployer
Generates Argo CD Application manifests with proper sync ordering using multi-source Applications.
Ordering Mechanism: Uses argocd.argoproj.io/sync-wave annotation.
Output Structure:
Helm Deployer (Default)
Generates a Helm per-component bundle with individual component directories.
Ordering Mechanism: Dependencies listed in Chart.yaml are deployed in order by Helm.
Output Structure:
Deployer Data Flow
Usage Examples
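Illustrative only; Helm is the default layout, and the deployer-selection flag name is not confirmed by this document:

```bash
# Default Helm per-component bundle
aicr bundle --recipe recipe.yaml --output ./bundle

# Argo CD Application manifests with sync-wave ordering (flag name hypothetical)
aicr bundle --recipe recipe.yaml --output ./bundle --deployer argocd
```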
Deployment Order Implementation
The orderComponentsByDeployment function ensures components are processed in the correct sequence:
Testing Deployers
Each deployer has tests verifying deployment order correctness:
References
Official Documentation
- urfave/cli Framework - CLI framework used by aicr
- errgroup Patterns - Concurrent error handling
- YAML v3 Library - YAML parsing and serialization
- Structured Logging (slog) - Standard library logging
- Context Package - Cancellation and deadlines
Kubernetes Integration
- client-go Documentation - Official K8s client
- Dynamic Client - Unstructured resource access
- CronJob Best Practices - Scheduled job patterns
- RBAC Authorization - Permission model
NVIDIA Tools
- NVIDIA SMI - GPU management
- NVML Library - Programmatic GPU access
- CUDA Toolkit - GPU computing platform
- GPU Operator - K8s GPU automation
Best Practices
- Semantic Versioning - Version comparison algorithm
- The Twelve-Factor App - Cloud-native application patterns
- Release Engineering Best Practices - Google SRE
- Go Code Review Comments - Idiomatic Go
Security
- OWASP Secure Coding Practices
- Kubernetes Pod Security Standards
- NIST 800-190: Container Security
- CIS Benchmarks - Security configuration baselines