Setting up Prometheus
Implementing a Prometheus stack can be complicated, but the Helm package manager, together with
the Prometheus Operator and kube-prometheus projects, makes it manageable.
The Operator uses standard configurations and dashboards for Prometheus and Grafana, and the Helm kube-prometheus-stack
chart gets a full cluster monitoring solution up and running by installing the Prometheus Operator and the rest of the components listed above.
First, add the helm repo:
$ helm repo add prometheus-community \
https://prometheus-community.github.io/helm-charts
Now, search for the available Prometheus charts:
$ helm search repo kube-prometheus
Once you’ve located the version of the chart to use, inspect the chart’s default values so you can modify the settings:
$ helm inspect values prometheus-community/kube-prometheus-stack > /tmp/kube-prometheus-stack.values
Next, edit the values file to change how the Prometheus server service is exposed. In the prometheus instance
section of the chart, change the service type from ClusterIP to NodePort. This makes the Prometheus server accessible at your
machine’s IP address on port 30090, at http://<machine-ip>:30090/
From:
## Port to expose on each node
## Only used if service.type is 'NodePort'
##
nodePort: 30090
## Loadbalancer IP
## Only use if service.type is "loadbalancer"
loadBalancerIP: ""
loadBalancerSourceRanges: []
## Service type
##
type: ClusterIP
To:
## Port to expose on each node
## Only used if service.type is 'NodePort'
##
nodePort: 30090
## Loadbalancer IP
## Only use if service.type is "loadbalancer"
loadBalancerIP: ""
loadBalancerSourceRanges: []
## Service type
##
type: NodePort
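Once the chart has been deployed with this change, a quick reachability check against the NodePort can confirm that the service is exposed. The following is a minimal sketch, assuming curl is available on the machine running the check. The function names prometheus_url and check_prometheus are hypothetical helpers, not part of the chart; /-/healthy is Prometheus's built-in health endpoint.

```shell
# Build the Prometheus base URL for a node IP and NodePort.
# The port defaults to 30090, the nodePort value in the values file above.
prometheus_url() {
  printf 'http://%s:%s/\n' "$1" "${2:-30090}"
}

# Probe Prometheus's built-in health endpoint.
# Returns non-zero if the server is unreachable or reports an error.
check_prometheus() {
  curl --silent --fail --max-time 5 "$(prometheus_url "$1" "${2:-30090}")-/healthy"
}
```

For example, running check_prometheus <machine-ip> probes http://<machine-ip>:30090/-/healthy and returns a non-zero status if nothing answers on that port.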
Also, set prometheusSpec.serviceMonitorSelectorNilUsesHelmValues to false, as shown below:
## If true, a nil or {} value for prometheus.prometheusSpec.serviceMonitorSelector will cause the
## prometheus resource to be created with selectors based on values in the helm deployment,
## which will also match the servicemonitors created
##
serviceMonitorSelectorNilUsesHelmValues: false
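Because the values file is long, it can be worth confirming that the edit landed before installing. The sketch below is one way to do that with grep. It assumes the values file path used in the helm inspect step, and service_monitor_selector_disabled is a hypothetical helper name.

```shell
# Path written by the `helm inspect values` step above; override by setting VALUES_FILE.
VALUES_FILE="${VALUES_FILE:-/tmp/kube-prometheus-stack.values}"

# Succeeds only if the file sets serviceMonitorSelectorNilUsesHelmValues to false
# on an uncommented line.
service_monitor_selector_disabled() {
  grep -Eq '^[[:space:]]*serviceMonitorSelectorNilUsesHelmValues:[[:space:]]*false[[:space:]]*$' "$1"
}
```

Running service_monitor_selector_disabled "$VALUES_FILE" && echo "ok" prints ok only when the setting is false.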
Add the following scrape configuration to the additionalScrapeConfigs section of the values file:
## AdditionalScrapeConfigs allows specifying additional Prometheus scrape configurations. Scrape configurations
## are appended to the configurations generated by the Prometheus Operator. Job configurations must have the form
## as specified in the official Prometheus documentation:
## https://prometheus.io/docs/prometheus/latest/configuration/configuration/#scrape_config. As scrape configs are
## appended, the user is responsible to make sure it is valid. Note that using this feature may expose the possibility
## to break upgrades of Prometheus. It is advised to review Prometheus release notes to ensure that no incompatible
## scrape configs are going to break Prometheus after the upgrade.
##
## The scrape configuration example below will find master nodes, provided they have the name .*mst.*, relabel the
## port to 2379 and allow etcd scraping provided it is running on all Kubernetes master nodes
##
additionalScrapeConfigs:
- job_name: gpu-metrics
  scrape_interval: 1s
  metrics_path: /metrics
  scheme: http
  kubernetes_sd_configs:
  - role: endpoints
    namespaces:
      names:
      - gpu-operator-resources
  relabel_configs:
  - source_labels: [__meta_kubernetes_pod_node_name]
    action: replace
    target_label: kubernetes_node
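After the stack is deployed, one way to check that the gpu-metrics job is being scraped is to query Prometheus's built-in up metric through its HTTP API. The following is a sketch, assuming curl is installed and the server is reachable on the NodePort configured earlier. prom_query is a hypothetical helper, and the CURL variable exists only so the command can be swapped out for a dry run.

```shell
# Run an instant PromQL query against the Prometheus HTTP API.
#   $1: base URL, e.g. http://<machine-ip>:30090
#   $2: PromQL expression
# CURL can be set to another command (for example, echo) for a dry run.
prom_query() {
  ${CURL:-curl} --silent --fail --get "$1/api/v1/query" --data-urlencode "query=$2"
}
```

For example, prom_query http://<machine-ip>:30090 'up{job="gpu-metrics"}' returns a JSON document with one series per discovered endpoint; a sample value of 1 means the most recent scrape of that endpoint succeeded.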
Finally, we can deploy the Prometheus and Grafana pods using the kube-prometheus-stack via Helm:
$ helm install prometheus-community/kube-prometheus-stack \
--create-namespace --namespace prometheus \
--generate-name \
--values /tmp/kube-prometheus-stack.values
Note
You can also override values in the Prometheus chart directly on the Helm command line:
$ helm install prometheus-community/kube-prometheus-stack \
--create-namespace --namespace prometheus \
--generate-name \
--set prometheus.service.type=NodePort \
--set prometheus.prometheusSpec.serviceMonitorSelectorNilUsesHelmValues=false
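The same two overrides can also be kept in a small values file instead of on the command line, which is easier to track in version control. This is a minimal sketch. The file path is a hypothetical choice, and the keys simply mirror the --set paths shown above.

```shell
# Hypothetical path for the override file; any writable location works.
OVERRIDES_FILE="${OVERRIDES_FILE:-/tmp/kube-prometheus-stack.overrides.yaml}"

# Same two settings as the --set flags above, expressed as nested YAML keys.
cat > "$OVERRIDES_FILE" <<'EOF'
prometheus:
  service:
    type: NodePort
  prometheusSpec:
    serviceMonitorSelectorNilUsesHelmValues: false
EOF
```

Passing --values "$OVERRIDES_FILE" to helm install in place of the two --set flags applies these settings on top of the chart defaults.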
After either install command completes, you should see console output similar to the following:
NAME: kube-prometheus-stack-1603211794
LAST DEPLOYED: Tue Oct 20 16:36:39 2020
NAMESPACE: prometheus
STATUS: deployed
REVISION: 1
NOTES:
kube-prometheus-stack has been installed. Check its status by running:
kubectl --namespace prometheus get pods -l "release=kube-prometheus-stack-1603211794"
Visit https://github.com/prometheus-operator/kube-prometheus for instructions on how to create & configure Alertmanager and Prometheus instances using the Operator.