Observability
This guide covers the steps to enable observability components such as Prometheus, Grafana, Parca and kube-state-metrics for the operator. By default, these components are disabled in the Helm chart and need to be enabled explicitly.
Purpose of kube-state-metrics
kube-state-metrics is a service that listens to the Kubernetes API server and generates metrics about the state of the objects (such as deployments, nodes, pods, etc.) managed by Kubernetes. In the context of our operator, we deploy kube-state-metrics to expose detailed metrics for the custom resource definitions (CRDs) managed by the operator.
Enabling Observability Components
To enable Prometheus, Grafana, Parca and kube-state-metrics, set the following values in the values.yaml file:
prometheus:
  enabled: true
grafana:
  enabled: true
kube-state-metrics:
  enabled: true
parca:
  enabled: true
Alternatively, if the operator is already deployed, you can enable these components with Helm command-line options. This is useful for testing the monitoring stack, but do not use it in production.
helm -n dpf-operator-system \
  upgrade dpf-operator dpf-repository/dpf-operator \
  --version=v0.1.0-latest \
  --values <(helm -n dpf-operator-system get values dpf-operator) \
  --set grafana.enabled=true \
  --set prometheus.enabled=true \
  --set parca.enabled=true \
  --set kube-state-metrics.enabled=true
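After the upgrade, it can help to confirm that the monitoring pods were actually created. The following is a minimal sketch, not part of the chart: the function name is hypothetical, and it assumes the components run in the dpf-operator-system namespace and that each pod name contains its component name.

```shell
# Hypothetical helper: list the pods in the operator namespace and keep only
# the monitoring components. Assumes pod names contain the component name.
list_observability_pods() {
  kubectl -n dpf-operator-system get pods --no-headers \
    | grep -E 'prometheus|grafana|parca|kube-state-metrics'
}
```

Running list_observability_pods should print one line per monitoring pod. If it prints nothing, the components were probably not enabled or are deployed under different names.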
Included Grafana Dashboards
We have preconfigured three Grafana dashboards to provide monitoring and insights into the operator and its controllers:
1) DOCA Platform Framework State: This dashboard provides a high-level overview of the operator and its controllers, highlighting key metrics such as resource status, condition states, and time to readiness.
2) Controller Runtime Dashboard: This dashboard provides detailed metrics and visualizations for the controllers, including information on reconciliation times, queue depths, and error rates.
3) Kubernetes API Server Requests Dashboard: This dashboard monitors the requests made to the Kubernetes API server, helping you to identify any performance bottlenecks or excessive API usage.
These dashboards are automatically deployed when Grafana is enabled. Once enabled, you can access them through the Grafana web UI under the "Dashboards" section.
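If Grafana is not exposed outside the cluster, one way to reach the web UI is a port-forward. This is a sketch under assumptions: the service name dpf-operator-grafana is inferred from the secret name used in this guide, and port 80 is the upstream Grafana chart default. Neither is confirmed here, so check both with kubectl -n dpf-operator-system get svc first. The function name is hypothetical.

```shell
# Hypothetical helper: forward a local port to the Grafana service.
# ASSUMPTIONS: the service is named dpf-operator-grafana and listens on
# port 80. Verify with: kubectl -n dpf-operator-system get svc
grafana_port_forward() {
  local_port="${1:-3000}"   # local port to listen on, defaults to 3000
  kubectl -n dpf-operator-system port-forward svc/dpf-operator-grafana "${local_port}:80"
}
```

With grafana_port_forward 3000 running, the dashboards would be reachable at http://localhost:3000 until the command is interrupted.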
Setting the Grafana Admin Password
You can set the Grafana admin password manually by configuring the grafana.adminPassword value in the values.yaml file:
grafana:
  adminPassword: <your-password>
Alternatively, if you prefer Grafana to generate a password automatically, leave grafana.adminPassword unset. After deployment, you can retrieve the autogenerated password using the following command:
kubectl -n dpf-operator-system get secret dpf-operator-grafana -ojsonpath='{.data.admin-password}' | base64 -d
This command fetches the password from the Kubernetes secret created for Grafana.
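Kubernetes stores Secret values base64-encoded, which is why the output is piped through base64 -d. The sketch below wraps the same lookup in a small function that takes the key name as an argument. The function name is hypothetical, and the admin-user key in the usage note is an assumption based on the upstream Grafana chart's secret layout.

```shell
# Hypothetical helper: print one decoded value from the Grafana secret.
# $1 is the key inside the secret's .data map, e.g. admin-password.
grafana_secret_value() {
  key="$1"
  kubectl -n dpf-operator-system get secret dpf-operator-grafana \
    -o jsonpath="{.data.${key}}" | base64 -d
}
```

For example, grafana_secret_value admin-password prints the password, and grafana_secret_value admin-user would print the login name if the secret contains that key.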
Note on Storage Solution
By default, Grafana and Prometheus use hostPath for storage. This is not recommended for production environments due to the potential for data loss and lack of scalability. You should configure a more reliable storage solution.
To change the storage solution, you need to modify the respective storage configurations in the values.yaml file:
prometheus:
  server:
    persistentVolume:
      enabled: true
      storageClass: <your-storage-class>
grafana:
  persistence:
    enabled: true
    storageClassName: <your-storage-class>
Make sure to replace <your-storage-class> with the appropriate storage class for your environment.
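Before applying the change, it may be worth checking that the storage class you plan to use exists in the cluster. A minimal sketch, with a hypothetical function name:

```shell
# Hypothetical helper: succeed only if the named StorageClass exists.
# $1 is the storage class name you intend to put in values.yaml.
storage_class_exists() {
  kubectl get storageclass "$1" >/dev/null 2>&1
}
```

A call such as storage_class_exists my-class && echo "found" (where my-class is a placeholder) prints "found" only when the class is present. Running kubectl get storageclass without a name lists all available classes.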
Parca also uses local storage by default. For larger collections of profile data, Parca supports storing profiles in an S3 bucket.