The Kubernetes Object Monitor module watches Kubernetes resources and generates health events when resources enter unhealthy states. This document covers all Helm configuration options for system administrators.
Controls whether the kubernetes-object-monitor module is deployed in the cluster.
Defines CPU and memory resource requests and limits for the kubernetes-object-monitor pod.
Sets the verbosity level for kubernetes-object-monitor logs.
Controls behavior of the Kubernetes controller watching resources.
Maximum number of concurrent reconciliation workers. Higher values allow parallel processing of multiple resources.
How often the controller re-evaluates all watched resources even without changes.
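As a rough sketch, the module-level settings described so far might be laid out in a values file as shown here. Every key name except the standard Kubernetes `resources` schema is an assumption made for illustration, and the numbers are placeholders rather than chart defaults; consult the chart's values.yaml for the real names and nesting.

```yaml
# Hypothetical key names and placeholder values; not taken from the chart.
enabled: true                    # deploy the module
resources:                       # standard Kubernetes requests/limits schema
  requests:
    cpu: 100m
    memory: 128Mi
  limits:
    cpu: 500m
    memory: 256Mi
logLevel: info                   # log verbosity
controller:
  maxConcurrentReconciles: 2     # parallel reconciliation workers
  resyncPeriod: 5m               # periodic re-evaluation interval
```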
Policies define which Kubernetes resources to monitor and when to generate health events.
Unique identifier for the policy used in logs and metrics.
Enables or disables the policy. Disabled policies are not compiled or evaluated.
Specifies the Kubernetes resource type to monitor.
API group of the resource. Use an empty string ("") for core resources (Pod, Node, Service, etc.).
API version of the resource (e.g., v1, v1beta1).
Kubernetes Kind of the resource (e.g., Node, Pod, Deployment).
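To illustrate the difference between core and non-core resources, the two fragments below identify a Node (core group, empty string) and a Deployment (the apps group). The `resource`, `group`, `version`, and `kind` key names are assumed from the field descriptions above and may differ in the actual chart.

```yaml
# Assumed key names. Core resource: the group is the empty string.
resource:
  group: ""
  version: v1
  kind: Node
---
# Non-core resource: Deployment lives in the "apps" API group.
resource:
  group: apps
  version: v1
  kind: Deployment
```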
CEL expression that evaluates to true when the resource is in an unhealthy state. It is evaluated with the resource variable bound to the full resource object.
CEL expression that accesses the resource through the resource variable.
Optional CEL expression that maps the resource to a specific Kubernetes node name.
CEL expression that returns a string node name.
Defines the health event to generate when the predicate matches.
Component type for the health event (e.g., Node, GPU, Pod).
Boolean indicating if this is a fatal error that should trigger quarantine.
Human-readable error message included in the health event.
Action code from health event proto (see health_event.proto).
Array of error code strings for categorization and filtering.
Optional behavior override for fault-quarantine. force forces node cordoning regardless of normal rules; skip skips node cordoning for the generated health event. Set at most one of force or skip.
Optional behavior override for node-drainer. force forces immediate pod eviction regardless of configured namespace drain modes; skip skips pod eviction and marks the event as already drained. Set at most one of force or skip.
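A health event definition that combines these fields could look roughly like the following. Only `isFatal`, `force`, and `skip` are named in this document; the other key names (`healthEvent`, `componentClass`, `errorCode`, `quarantineOverrides`, `drainOverrides`) and the error code are hypothetical, and the action code is left as a placeholder because its valid values come from health_event.proto.

```yaml
# Hypothetical key names; check the chart's values.yaml before use.
healthEvent:
  componentClass: Pod
  isFatal: true
  message: "Pod reported an unrecoverable failure"
  # recommendedAction: <action code from health_event.proto>
  errorCode:
    - EXAMPLE_POD_FAILURE      # hypothetical error code
  quarantineOverrides:
    force: true                # cordon the node regardless of normal rules
  drainOverrides:
    skip: true                 # no pod eviction; event is marked as already drained
```

Each override block sets only one of force or skip, in line with the constraint described above.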
Access the resource object via the resource variable.
Check if condition exists and is True:
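One way to write such a check, assuming the expression lives under a `predicate.expression` key (an assumed name) and the resource exposes a standard `status.conditions` list:

```yaml
predicate:
  expression: |
    has(resource.status.conditions) &&
    resource.status.conditions.exists(c, c.type == "Ready" && c.status == "True")
```

The `has()` guard avoids an evaluation error on resources that have not reported any conditions yet.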
Check field value:
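For instance, a predicate on a Node could compare the standard `spec.unschedulable` field; the `predicate.expression` key name is an assumption:

```yaml
predicate:
  expression: |
    has(resource.spec.unschedulable) && resource.spec.unschedulable == true
```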
Check label exists:
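A possible form, using the CEL `in` operator on the labels map. The label key `example.com/maintenance` is purely illustrative, and the `predicate.expression` key name is assumed:

```yaml
predicate:
  expression: |
    has(resource.metadata.labels) &&
    "example.com/maintenance" in resource.metadata.labels
```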
Map resources to nodes using CEL expressions.
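For a Pod, the simplest mapping returns the node the Pod is scheduled on. The sketch below assumes the expression is placed under `nodeAssociation.expression` (an assumed key layout); `spec.nodeName` is the standard Pod field.

```yaml
nodeAssociation:
  expression: |
    resource.spec.nodeName
```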
The lookup() function retrieves other Kubernetes resources during evaluation.
version (string) - API version (e.g., "v1", "apps/v1")
kind (string) - Resource Kind (e.g., "Pod", "Node")
namespace (string) - Namespace (use an empty string "" for cluster-scoped resources)
name (string) - Resource name
Get node from pod reference:
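A sketch of such a lookup, under several assumptions: the arguments are passed in the order listed above, `lookup()` returns the full resource object, and the monitored custom resource stores the Pod name in a hypothetical `spec.podName` field.

```yaml
# Assumed key layout; spec.podName is a hypothetical field on the watched resource.
nodeAssociation:
  expression: |
    lookup("v1", "Pod", resource.metadata.namespace, resource.spec.podName).spec.nodeName
```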
Monitor nodes that are not in Ready state.
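A policy along these lines might be written as follows. The key names (`policies`, `name`, `enabled`, `resource`, `predicate`, `healthEvent`, `componentClass`, `errorCode`) are inferred from the field descriptions in this document and should be checked against the chart; the error code is illustrative.

```yaml
# Sketch only: key names are assumptions, not confirmed chart schema.
policies:
  - name: node-not-ready
    enabled: true
    resource:
      group: ""
      version: v1
      kind: Node
    predicate:
      expression: |
        has(resource.status.conditions) &&
        resource.status.conditions.exists(c, c.type == "Ready" && c.status != "True")
    healthEvent:
      componentClass: Node
      isFatal: true
      message: "Node is not in Ready state"
      # recommendedAction: <action code from health_event.proto>
      errorCode:
        - NODE_NOT_READY       # illustrative code
```

No nodeAssociation is given here because the watched resource is itself a Node.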
Monitor custom node conditions.
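The same pattern can target a condition published by another agent. In this sketch the condition type `ExampleHardwareFault` is hypothetical, as are the key names and the error code; the event is marked non-fatal so that it does not trigger quarantine.

```yaml
# Sketch only: key names and the condition type are assumptions.
policies:
  - name: node-example-hardware-fault
    enabled: true
    resource:
      group: ""
      version: v1
      kind: Node
    predicate:
      expression: |
        has(resource.status.conditions) &&
        resource.status.conditions.exists(c,
          c.type == "ExampleHardwareFault" && c.status == "True")
    healthEvent:
      componentClass: Node
      isFatal: false
      message: "Node reports the ExampleHardwareFault condition"
      errorCode:
        - EXAMPLE_HARDWARE_FAULT   # illustrative code
```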
RBAC permissions are automatically generated based on configured policies:
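As an illustration of what this could mean in practice, a policy that watches core Nodes would need a ClusterRole rule similar to the one below. The rule syntax is the standard Kubernetes RBAC schema, but the exact verbs the chart generates are an assumption here.

```yaml
# Illustrative rule for a policy on core/v1 Node; verbs are assumed.
rules:
  - apiGroups: [""]
    resources: ["nodes"]
    verbs: ["get", "list", "watch"]
```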
When adding a new policy for a Custom Resource, ensure the CRD is installed before deploying the kubernetes-object-monitor.
Use nodeAssociation for non-Node resources to enable quarantine.
Set isFatal: true only for errors requiring node quarantine.