kubernetes metrics-server

Kubernetes metrics-server FailedDiscoveryCheck


I was hoping to get a little help; my Google-fu didn't get me much closer. I'm trying to install metrics-server on my four-node Fedora CoreOS Kubernetes cluster like so:

kubectl apply -f deploy/kubernetes/
clusterrole.rbac.authorization.k8s.io/system:aggregated-metrics-reader created
clusterrolebinding.rbac.authorization.k8s.io/metrics-server:system:auth-delegator created
rolebinding.rbac.authorization.k8s.io/metrics-server-auth-reader created
apiservice.apiregistration.k8s.io/v1beta1.metrics.k8s.io created
serviceaccount/metrics-server created
deployment.apps/metrics-server created
service/metrics-server created
clusterrole.rbac.authorization.k8s.io/system:metrics-server created
clusterrolebinding.rbac.authorization.k8s.io/system:metrics-server created

The service never seems to become available:

kubectl describe apiservice v1beta1.metrics.k8s.io
Name:         v1beta1.metrics.k8s.io
Namespace:
Labels:       <none>
Annotations:  kubectl.kubernetes.io/last-applied-configuration:
                {"apiVersion":"apiregistration.k8s.io/v1beta1","kind":"APIService","metadata":{"annotations":{},"name":"v1beta1.metrics.k8s.io"},"spec":{"...
API Version:  apiregistration.k8s.io/v1
Kind:         APIService
Metadata:
  Creation Timestamp:  2020-03-04T16:53:33Z
  Resource Version:    1611816
  Self Link:           /apis/apiregistration.k8s.io/v1/apiservices/v1beta1.metrics.k8s.io
  UID:                 65d9a56a-c548-4d7e-a647-8ce7a865a266
Spec:
  Group:                     metrics.k8s.io
  Group Priority Minimum:    100
  Insecure Skip TLS Verify:  true
  Service:
    Name:            metrics-server
    Namespace:       kube-system
    Port:            443
  Version:           v1beta1
  Version Priority:  100
Status:
  Conditions:
    Last Transition Time:  2020-03-04T16:53:33Z
    Message:               failing or missing response from https://10.3.230.59:443/apis/metrics.k8s.io/v1beta1: bad status from https://10.3.230.59:443/apis/metrics.k8s.io/v1beta1: 403
    Reason:                FailedDiscoveryCheck
    Status:                False
    Type:                  Available
Events:                    <none>
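
The Available condition on this APIService is exactly what kubectl top depends on, and its Reason field names the failure. With a live cluster you could query it directly with kubectl get apiservice and a jsonpath expression; as a self-contained illustration, here is the same extraction run over the captured output above (the sample text is embedded so the snippet works without a cluster):

```shell
# Extract the failure reason from the describe output. With a live cluster,
# pipe `kubectl describe apiservice v1beta1.metrics.k8s.io` in instead.
describe_output='Status:
  Conditions:
    Reason:                FailedDiscoveryCheck
    Status:                False
    Type:                  Available'
echo "$describe_output" | awk '/Reason:/ {print $2}'
# prints: FailedDiscoveryCheck
```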

Here is what I found while diagnosing and googling around:

kubectl get deploy,svc -n kube-system |egrep metrics-server
deployment.apps/metrics-server   1/1     1            1           8m7s
service/metrics-server   ClusterIP   10.3.230.59   <none>        443/TCP         8m7s

kubectl get --raw "/apis/metrics.k8s.io/v1beta1/nodes"
Error from server (ServiceUnavailable): the server is currently unable to handle the request

kubectl get all --all-namespaces | grep -i metrics-server
kube-system      pod/metrics-server-75b5d446cd-zj4jm                              1/1     Running   0          9m11s
kube-system   service/metrics-server   ClusterIP      10.3.230.59    <none>        443/TCP                                     9m11s
kube-system      deployment.apps/metrics-server   1/1     1            1           9m11s
kube-system      replicaset.apps/metrics-server-75b5d446cd   1         1         1       9m11s

kubectl logs -f metrics-server-75b5d446cd-zj4jm -n kube-system
I0304 16:53:36.475657       1 serving.go:312] Generated self-signed cert (/tmp/apiserver.crt, /tmp/apiserver.key)
W0304 16:53:38.229267       1 authentication.go:296] Cluster doesn't provide requestheader-client-ca-file in configmap/extension-apiserver-authentication in kube-system, so request-header client certificate authentication won't work.
I0304 16:53:38.267760       1 secure_serving.go:116] Serving securely on [::]:4443
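
That requestheader-client-ca-file warning means the kube-apiserver is not publishing a front-proxy CA, which is one common cause of an aggregated API failing its discovery check with 403: metrics-server cannot verify that the incoming request really comes from the apiserver. On clusters that were not set up with kubeadm (such as a hand-rolled Fedora CoreOS install), the aggregation layer has to be enabled explicitly with kube-apiserver flags along these lines (the file paths below are kubeadm-style examples, not values from your cluster; adjust them to your own PKI layout):

```text
--requestheader-client-ca-file=/etc/kubernetes/pki/front-proxy-ca.crt
--requestheader-allowed-names=front-proxy-client
--requestheader-extra-headers-prefix=X-Remote-Extra-
--requestheader-group-headers=X-Remote-Group
--requestheader-username-headers=X-Remote-User
--proxy-client-cert-file=/etc/kubernetes/pki/front-proxy-client.crt
--proxy-client-key-file=/etc/kubernetes/pki/front-proxy-client.key
```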

kubectl get -n kube-system deployment metrics-server -o yaml | grep -i args -A 10
      {"apiVersion":"apps/v1","kind":"Deployment","metadata":{"annotations":{},"labels":{"k8s-app":"metrics-server"},"name":"metrics-server","namespace":"kube-system"},"spec":{"selector":{"matchLabels":{"k8s-app":"metrics-server"}},"template":{"metadata":{"labels":{"k8s-app":"metrics-server"},"name":"metrics-server"},"spec":{"containers":[{"args":["--cert-dir=/tmp","--secure-port=4443","--kubelet-insecure-tls","--kubelet-preferred-address-types=InternalIP"],"image":"k8s.gcr.io/metrics-server-amd64:v0.3.6","imagePullPolicy":"IfNotPresent","name":"metrics-server","ports":[{"containerPort":4443,"name":"main-port","protocol":"TCP"}],"securityContext":{"readOnlyRootFilesystem":true,"runAsNonRoot":true,"runAsUser":1000},"volumeMounts":[{"mountPath":"/tmp","name":"tmp-dir"}]}],"nodeSelector":{"beta.kubernetes.io/os":"linux","kubernetes.io/arch":"amd64"},"serviceAccountName":"metrics-server","volumes":[{"emptyDir":{},"name":"tmp-dir"}]}}}}
  creationTimestamp: "2020-03-04T16:53:33Z"
  generation: 1
  labels:
    k8s-app: metrics-server
  name: metrics-server
  namespace: kube-system
  resourceVersion: "1611810"
  selfLink: /apis/apps/v1/namespaces/kube-system/deployments/metrics-server
  uid: 006e758e-bd33-47d7-8378-d3a8081ee8a8
spec:
--
      - args:
        - --cert-dir=/tmp
        - --secure-port=4443
        - --kubelet-insecure-tls
        - --kubelet-preferred-address-types=InternalIP
        image: k8s.gcr.io/metrics-server-amd64:v0.3.6
        imagePullPolicy: IfNotPresent
        name: metrics-server
        ports:
        - containerPort: 4443
          name: main-port

finally my deployment config:

 spec:
  selector:
    matchLabels:
      k8s-app: metrics-server
  template:
    metadata:
      name: metrics-server
      labels:
        k8s-app: metrics-server
    spec:
      serviceAccountName: metrics-server
      volumes:
      # mount in tmp so we can safely use from-scratch images and/or read-only containers
      - name: tmp-dir
        emptyDir: {}
      containers:
      - name: metrics-server
        image: k8s.gcr.io/metrics-server-amd64:v0.3.6
        command:
          - /metrics-server
          - --kubelet-insecure-tls
          - --kubelet-preferred-address-types=InternalIP
        args:
          - --cert-dir=/tmp
          - --secure-port=4443
          - --kubelet-insecure-tls
          - --kubelet-preferred-address-types=InternalIP
        ports:
        - name: main-port
          containerPort: 4443
          protocol: TCP
        securityContext:
          readOnlyRootFilesystem: true
          runAsNonRoot: true
          runAsUser: 1000
        imagePullPolicy: IfNotPresent
        volumeMounts:
        - name: tmp-dir
          mountPath: /tmp
      hostNetwork: true
      nodeSelector:
        beta.kubernetes.io/os: linux
        kubernetes.io/arch: "amd64"

I'm at a loss as to what the cause could be. All I'm trying to do is get the metrics service running so that a basic kubectl top node displays some info, but all I get is:

Error from server (ServiceUnavailable): the server is currently unable to handle the request (get pods.metrics.k8s.io)

I have searched the internet and tried adding the following command: and args: lines, but no luck:

command:
  - /metrics-server
  - --kubelet-insecure-tls
  - --kubelet-preferred-address-types=InternalIP
args:
  - --cert-dir=/tmp
  - --secure-port=4443
  - --kubelet-insecure-tls
  - --kubelet-preferred-address-types=InternalIP
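
For reference, in Kubernetes a container's command: replaces the image entrypoint and args: is appended after it, so the command:/args: pair above effectively makes the kubelet launch the process with the kubelet flags duplicated:

```text
/metrics-server \
  --kubelet-insecure-tls \
  --kubelet-preferred-address-types=InternalIP \
  --cert-dir=/tmp \
  --secure-port=4443 \
  --kubelet-insecure-tls \
  --kubelet-preferred-address-types=InternalIP
```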

Can anyone shed some light on how to fix this? Thanks.

Full pod log file on Pastebin: Log File


Solution

  • I've reproduced your issue. I used Calico as the CNI.

    $ kubectl get nodes
    NAME              STATUS   ROLES    AGE     VERSION
    fedora-master     Ready    master   6m27s   v1.17.3
    fedora-worker-1   Ready    <none>   4m48s   v1.17.3
    fedora-worker-2   Ready    <none>   4m46s   v1.17.3
    
    fedora-master:~/metrics-server$ kubectl describe apiservice v1beta1.metrics.k8s.io
    Status:
      Conditions:
        Last Transition Time:  2020-03-12T16:04:59Z
        Message:               failing or missing response from https://10.99.122.196:443/apis/metrics.k8s.io/v
    1beta1: Get https://10.99.122.196:443/apis/metrics.k8s.io/v1beta1: net/http: request canceled while waiting
     for connection (Client.Timeout exceeded while awaiting headers)
    
    fedora-master:~/metrics-server$ kubectl top pod
    Error from server (ServiceUnavailable): the server is currently unable to handle the request (get pods.metrics.k8s.io)
    

    When you have only one node in the cluster, the default settings in the metrics-server repo work correctly. The issue occurs when the cluster has more than one node; I used 1 master and 2 workers to reproduce it. Below is an example deployment that works correctly (it has all the required args). First remove your current metrics-server YAMLs (kubectl delete -f deploy/kubernetes), then execute:

    $ git clone https://github.com/kubernetes-sigs/metrics-server
    $ cd metrics-server/deploy/kubernetes/
    $ vi metrics-server-deployment.yaml
    

    Paste below YAML:

    ---
    apiVersion: v1
    kind: ServiceAccount
    metadata:
      name: metrics-server
      namespace: kube-system
    ---
    apiVersion: apps/v1
    kind: Deployment
    metadata:
      name: metrics-server
      namespace: kube-system
      labels:
        k8s-app: metrics-server
    spec:
      selector:
        matchLabels:
          k8s-app: metrics-server
      template:
        metadata:
          name: metrics-server
          labels:
            k8s-app: metrics-server
        spec:
          serviceAccountName: metrics-server
          volumes:
          # mount in tmp so we can safely use from-scratch images and/or read-only containers
          - name: tmp-dir
            emptyDir: {}
          hostNetwork: true
          containers:
          - name: metrics-server
            image: k8s.gcr.io/metrics-server-amd64:v0.3.6
            imagePullPolicy: IfNotPresent
            args:
              - /metrics-server
              - --kubelet-preferred-address-types=InternalIP
              - --kubelet-insecure-tls
              - --cert-dir=/tmp
              - --secure-port=4443
            ports:
            - name: main-port
              containerPort: 4443
              protocol: TCP
            securityContext:
              readOnlyRootFilesystem: true
              runAsNonRoot: true
              runAsUser: 1000
            volumeMounts:
            - name: tmp-dir
              mountPath: /tmp
          nodeSelector:
            kubernetes.io/os: linux
            kubernetes.io/arch: "amd64"
    

    Save and quit using :wq.

    $ cd ~/metrics-server
    $ kubectl apply -f deploy/kubernetes/
    clusterrole.rbac.authorization.k8s.io/system:aggregated-metrics-reader created
    clusterrolebinding.rbac.authorization.k8s.io/metrics-server:system:auth-delegator created
    rolebinding.rbac.authorization.k8s.io/metrics-server-auth-reader created
    apiservice.apiregistration.k8s.io/v1beta1.metrics.k8s.io created
    serviceaccount/metrics-server created
    deployment.apps/metrics-server created
    service/metrics-server created
    clusterrole.rbac.authorization.k8s.io/system:metrics-server created
    clusterrolebinding.rbac.authorization.k8s.io/system:metrics-server created
    

    Wait a while for metrics-server to gather a few metrics from nodes.

    $ kubectl describe apiservice v1beta1.metrics.k8s.io
    Name:         v1beta1.metrics.k8s.io
    Namespace:    
    ...
    Metadata:
      Creation Timestamp:  2020-03-12T16:57:58Z
    ...
    Spec:
      Group:                     metrics.k8s.io
      Group Priority Minimum:    100
      Insecure Skip TLS Verify:  true
      Service:
        Name:            metrics-server
        Namespace:       kube-system
        Port:            443
      Version:           v1beta1
      Version Priority:  100
    Status:
      Conditions:
        Last Transition Time:  2020-03-12T16:58:01Z
        Message:               all checks passed
        Reason:                Passed
        Status:                True
        Type:                  Available
    Events:                    <none>
    

    After a few minutes you can use kubectl top:

    $ kubectl top nodes
    NAME              CPU(cores)   CPU%   MEMORY(bytes)   MEMORY%   
    fedora-master     188m         9%     1315Mi          17%       
    fedora-worker-1   109m         5%     982Mi           13%       
    fedora-worker-2   84m          4%     969Mi           13%   
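
As a sanity check on those numbers: kubectl top reports CPU in millicores, and CPU% is usage divided by the node's allocatable CPU. Assuming the master has 2 CPUs, i.e. 2000m allocatable (an assumption, since allocatable isn't shown above), 188m works out to about 9%, matching the CPU% column:

```shell
# CPU% = millicores used / node allocatable millicores (2000m assumed here).
awk 'BEGIN { printf "%d%%\n", 188 / 2000 * 100 }'
# prints: 9%
```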
    

    If you still encounter issues, please add --v=6 to the deployment args and provide the logs from the metrics-server pod:

    containers:
    - name: metrics-server
      image: k8s.gcr.io/metrics-server-amd64:v0.3.6
      args:
        - /metrics-server
        - --v=6
        - --kubelet-preferred-address-types=InternalIP
        - --kubelet-insecure-tls