I just upgraded `kube-prometheus-stack` on my Kubernetes cluster (Helm chart, deployed through Terraform) and started seeing the following two errors:
failed calling webhook "prometheusrulemutate.monitoring.coreos.com":
failed to call webhook:
Post "https://kube-prometheus-stack-operator.infra.svc:443/admission-prometheusrules/mutate?timeout=30s":
x509: certificate signed by unknown authority
Error: cannot patch "kube-prometheus-stack-prometheus-node-exporter" with kind DaemonSet:
DaemonSet.apps "kube-prometheus-stack-prometheus-node-exporter" is invalid: spec.selector:
Invalid value: v1.LabelSelector{MatchLabels:map[string]string{"app.kubernetes.io/instance":"kube-prometheus-stack", "app.kubernetes.io/name":"prometheus-node-exporter"},
MatchExpressions:[]v1.LabelSelectorRequirement(nil)}: field is immutable &&
cannot patch "kube-prometheus-stack-kube-state-metrics" with kind Deployment:
Deployment.apps "kube-prometheus-stack-kube-state-metrics" is invalid: spec.selector:
Invalid value: v1.LabelSelector{MatchLabels:map[string]string{"app.kubernetes.io/instance":"kube-prometheus-stack", "app.kubernetes.io/name":"kube-state-metrics"},
MatchExpressions:[]v1.LabelSelectorRequirement(nil)}: field is immutable
The table below shows the old and new versions of the kube-prometheus-stack image and chart involved in the upgrade:
| Component | Old version (14 April 2021) | New version (24 July 2023) |
|---|---|---|
| Image | v0.46.0 | v0.66.0 |
| Chart | 14.9.0 | 48.2.0 |
How can I fix these two errors?
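To fix the first error, I set `failurePolicy` to `Ignore` under `prometheusOperator.admissionWebhooks` in the values file for the Helm chart. This does not repair the webhook certificate; it only tells the API server to proceed when the webhook call fails. The relevant values are: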
prometheusOperator:
  enabled: true
  admissionWebhooks:
    # failurePolicy: "Fail"
    failurePolicy: "Ignore"
To fix the second error, I set `enabled` to `false` for both `kubeStateMetrics` and `nodeExporter` in the values file and applied the Helm chart, then set `enabled` back to `true` for both and applied it again. That worked. The error says `spec.selector` is immutable, so the existing DaemonSet and Deployment apparently had to be deleted and recreated before the newer chart version could install them. I am not sure what caused the selector to differ: possibly the chart's labels changed between versions, or something was incorrect in my values file during the upgrade.
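The two passes described above can be sketched as values overrides. This is only an illustration of the sequence, assuming the `kubeStateMetrics.enabled` and `nodeExporter.enabled` keys named in this post; the two YAML documents are applied one after the other (two separate Helm/Terraform applies), not together.

```yaml
# Pass 1: disable both sub-charts so the existing DaemonSet and
# Deployment (with the old, immutable selectors) are deleted.
kubeStateMetrics:
  enabled: false
nodeExporter:
  enabled: false
---
# Pass 2: re-enable both so the newer chart version recreates
# the resources from scratch with its own selectors.
kubeStateMetrics:
  enabled: true
nodeExporter:
  enabled: true
```

Between the two applies, node and kube-state metrics are not collected, so expect a short gap in those series.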
Reference: kube-prometheus-stack / v48.2.0