kubernetes linkerd metrics-server

Kubernetes metrics-server not working with Linkerd


I have a metrics-server and a horizontal pod autoscaler using this server, running on my cluster.
This works perfectly fine until I inject linkerd-proxies into the deployments of the namespace where my application is running. After that, running kubectl top pod in that namespace fails with "error: Metrics not available for pod <name>". However, nothing appears in the metrics-server pod's logs.
The metrics-server clearly works fine in other namespaces, because top works in every namespace but the meshed one.

At first I thought it could be because the proxies' resource requests/limits weren't set, but after re-running the injection with them set (kubectl get -n <namespace> deploy -o yaml | linkerd inject - --proxy-cpu-request "10m" --proxy-cpu-limit "1" --proxy-memory-request "64Mi" --proxy-memory-limit "256Mi" | kubectl apply -f -), the issue remains.

Is this a known problem, and are there any possible solutions?

PS: I have a kube-prometheus-stack running in a different namespace, and it seems to be able to scrape the pod metrics from the meshed pods just fine. (Image: Grafana dashboard showing that Prometheus can collect the data.)


Solution

  • The problem was apparently a bug in the cAdvisor stats provider when used with a CRI runtime. The linkerd-init containers keep producing metrics after they have terminated, which shouldn't happen, and those metrics are zero-valued. The metrics-server ignores stats from any pod that contains a container reporting zero values; it does this to avoid reporting invalid metrics, for example when a container is restarting or its metrics haven't been collected yet. As a result, the whole meshed pod is dropped and kubectl top reports that no metrics are available. You can follow up on the issue here. Solutions seem to be switching to another container runtime or enabling the kubelet's PodAndContainerStatsFromCRI feature gate, which makes the internal CRI stats provider responsible for the stats instead of the cAdvisor one.
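
To illustrate why one stale container hides the metrics of the whole pod, here is a minimal Python sketch of the filtering behaviour described above. It is a simplified model, not the metrics-server's actual code: the ContainerStats type, the pod_metrics_available function, and the sample numbers are all hypothetical, and treating a zero in either CPU or memory as "invalid" is an assumption.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ContainerStats:
    """Hypothetical per-container sample as reported by the kubelet."""
    name: str
    cpu_nanocores: int
    memory_bytes: int


def pod_metrics_available(containers: List[ContainerStats]) -> bool:
    """Model of the filter: a pod is reported only if every container
    has non-zero CPU and memory values (assumed rule, see lead-in)."""
    return all(c.cpu_nanocores > 0 and c.memory_bytes > 0 for c in containers)


# Pod without the mesh: every container reports real usage.
unmeshed_pod = [
    ContainerStats("app", 12_000_000, 80 * 1024**2),
]

# Meshed pod: the terminated linkerd-init container still shows up,
# but with zero values, so the whole pod is skipped.
meshed_pod = [
    ContainerStats("app", 12_000_000, 80 * 1024**2),
    ContainerStats("linkerd-proxy", 3_000_000, 20 * 1024**2),
    ContainerStats("linkerd-init", 0, 0),
]

if __name__ == "__main__":
    print("unmeshed pod reported:", pod_metrics_available(unmeshed_pod))
    print("meshed pod reported:", pod_metrics_available(meshed_pod))
```

Under this model the application and proxy containers are healthy, yet the pod still disappears from kubectl top. That matches the symptom in the question, and it would also explain why Prometheus, which scrapes the containers directly, is unaffected.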