Problem: For some reason, the Helm release of kube-prometheus-stack is stuck in the Pending-install status. What is the correct way to install a Helm release for this chart using the Helm CLI?
Details:
Because the k8s.gcr.io Docker registry was frozen, I had to switch the kube-state-metrics image to registry.k8s.io by updating values.yaml as follows:
kube-state-metrics:
  prometheusScrape: true
  image:
    repository: registry.k8s.io/kube-state-metrics/kube-state-metrics
    tag: v1.9.8
    pullPolicy: Always
  namespaceOverride: ""
  rbac:
    create: true
  podSecurityPolicy:
    enabled: true
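One way to confirm that an override like this takes effect before upgrading is to render the chart locally and list the image references it produces. The following is only a sketch: the helper name list_images is made up for illustration, and the example invocation assumes the chart repository was added under the alias prometheus-community.

```shell
#!/usr/bin/env bash
# Print the unique container image references found in rendered manifests
# read from stdin (lines of the form "image: <ref>", optionally quoted).
# list_images is a hypothetical helper, not part of helm or kubectl.
list_images() {
  grep -E '^[[:space:]]*(-[[:space:]]+)?image:[[:space:]]' \
    | sed -E 's/^[[:space:]]*(-[[:space:]]+)?image:[[:space:]]*//' \
    | tr -d "\"'" \
    | sort -u
}

# Example invocation (assumes the repo alias "prometheus-community"):
# helm template kube-prometheus-stack prometheus-community/kube-prometheus-stack \
#   --version 14.9.0 -f values.yaml | list_images
```

If the kube-state-metrics entry in the output still points at k8s.gcr.io, the override is probably nested under the wrong key in values.yaml.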
After that, when I tried to update the Helm release of kube-prometheus-stack, keeping the same chart version (14.9.0), the release ended up in the Failed status. On retry, Helm deleted the previous release and created a new one. All components of the new release were created successfully, but the release itself got stuck in the Pending-install status.
I waited for almost 30 minutes, but the status did not change. I also tried deleting the Helm release, rolling it back, and deleting the Helm release secret, all without success.
What could be the issue? How can I solve it?
Solution: After some investigation, I found that a job named kube-prometheus-stack-admission-patch was failing with a BackoffLimitExceeded error. It appears to be some kind of initialization job. Deleting the job itself (not its pod) fixed the issue, and the Helm release changed its status to Deployed.
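The steps above can be sketched with kubectl and helm roughly as follows. The namespace monitoring is an assumption (the post does not say which namespace was used), and the function names are made up for illustration. The release and job names are the ones mentioned above.

```shell
#!/usr/bin/env bash
# Assumed values: adjust to your cluster. The namespace is hypothetical;
# the release and job names are the ones mentioned in the post.
NAMESPACE="monitoring"
RELEASE="kube-prometheus-stack"
JOB="${RELEASE}-admission-patch"

# Show releases that are still pending, then the job and its recent events.
inspect_release() {
  helm list --namespace "$NAMESPACE" --pending
  kubectl get job "$JOB" --namespace "$NAMESPACE"
  kubectl describe job "$JOB" --namespace "$NAMESPACE"
}

# Delete the Job object itself; by default its pods are removed along with it.
delete_patch_job() {
  kubectl delete job "$JOB" --namespace "$NAMESPACE"
}

# Check the release status afterwards.
check_release() {
  helm status "$RELEASE" --namespace "$NAMESPACE"
}
```

Calling inspect_release, delete_patch_job, and check_release in that order mirrors what worked here. Nothing runs until the functions are called, so the values can be reviewed first.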
Error log from the kube-prometheus-stack-admission-patch job:
W0331 10:58:03.079451 1 client_config.go:608] Neither --kubeconfig nor --master was specified. Using the inClusterConfig. This might not work.
{"level":"info","msg":"patching webhook configurations 'kube-prometheus-stack-admission' mutating=true, validating=true, failurePolicy=Fail","source":"k8s/k8s.go:39","time":"2023-03-31T10:58:03Z"}
{"err":"the server could not find the requested resource","level":"fatal","msg":"failed getting validating webhook","source":"k8s/k8s.go:48","time":"2023-03-31T10:58:03Z"}
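To pull only the fatal entries out of the job's log, a small filter like the one below may help. It is a sketch: the helper names are made up, and the namespace in the example invocation is an assumption. The fatal entry says the validating webhook could not be found. One possible cause, which this post does not confirm, is that the cluster no longer serves the API version the patch job requests. The second helper lists the admissionregistration versions the cluster serves, so that can be checked.

```shell
#!/usr/bin/env bash
# Keep only the JSON log entries whose level is "fatal" (reads stdin).
# fatal_entries is a hypothetical helper name.
fatal_entries() {
  grep -F '"level":"fatal"'
}

# List the admissionregistration API versions served by the cluster.
# Whether this explains the failure is an assumption to verify.
served_admission_apis() {
  kubectl api-versions | grep admissionregistration
}

# Example invocation (the namespace "monitoring" is an assumption):
# kubectl logs job/kube-prometheus-stack-admission-patch -n monitoring | fatal_entries
```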