spring-boot kubernetes autoscaling horizontal-pod-autoscaling

K8s Horizontal pod autoscaling not working

I am following this tutorial to try out k8s horizontal pod autoscaling.

I have the following k8s manifest:

apiVersion: v1 
kind: Service 
metadata: 
  name: springboot-k8s-svc
spec:
  selector:
    app: spring-boot-k8s
  ports:
    - protocol: "TCP"
      port: 8080 
      targetPort: 8080 
  type: NodePort 
---
apiVersion: apps/v1
kind: Deployment 
metadata:
  name: spring-boot-k8s
spec:
  selector:
    matchLabels:
      app: spring-boot-k8s
  replicas: 1 
  template:
    metadata:
      labels:
        app: spring-boot-k8s
    spec:
      containers:
        - name: spring-boot-k8s
          image: springboot-k8s-example:1.0 
          imagePullPolicy: IfNotPresent
          ports:
            - containerPort: 8080
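
One thing worth double-checking in this manifest, independent of the metrics problem described in the Solution: the container declares no CPU request. The describe output further down labels the HPA metric "as a percentage of request", and the HPA cannot compute a CPU utilization percentage for a pod whose containers have no CPU request. A minimal sketch of adding one with kubectl set resources follows; the add_cpu_request helper name, the KUBECTL override, and the 200m value are assumptions for illustration, not something taken from the tutorial.

```shell
#!/bin/sh
# Hypothetical helper: add a CPU request to a Deployment so that the HPA
# has a baseline for its percentage target.
# KUBECTL can be overridden (e.g. KUBECTL=echo) to preview the command.
KUBECTL="${KUBECTL:-kubectl}"

add_cpu_request() {
    deployment="$1"   # e.g. spring-boot-k8s
    cpu="$2"          # e.g. 200m (assumed value, tune it for your app)
    "$KUBECTL" set resources deployment "$deployment" --requests="cpu=$cpu"
}

# Example use (this changes the pod template, so it triggers a new rollout):
#   add_cpu_request spring-boot-k8s 200m
```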

Currently I have the following running in my minikube:

$ kubectl get all
NAME                 TYPE        CLUSTER-IP   EXTERNAL-IP   PORT(S)   AGE
service/kubernetes   ClusterIP   10.96.0.1    <none>        443/TCP   4h52m

I start my dummy spring boot application:

$ kubectl apply -f deployment-n-svc.yaml 
service/springboot-k8s-svc created
deployment.apps/spring-boot-k8s created

The app seems to start as expected:

$ kubectl get all
NAME                                  READY   STATUS    RESTARTS   AGE
pod/spring-boot-k8s-bccc4c557-7wbrn   1/1     Running   0          5s

NAME                         TYPE        CLUSTER-IP     EXTERNAL-IP   PORT(S)          AGE
service/kubernetes           ClusterIP   10.96.0.1      <none>        443/TCP          4h53m
service/springboot-k8s-svc   NodePort    10.99.136.27   <none>        8080:30931/TCP   5s

NAME                              READY   UP-TO-DATE   AVAILABLE   AGE
deployment.apps/spring-boot-k8s   1/1     1            1           5s

NAME                                        DESIRED   CURRENT   READY   AGE
replicaset.apps/spring-boot-k8s-bccc4c557   1         1         1       5s

When I hit the REST endpoint, I get the expected output:

$ curl http://192.168.49.2:30931/message
OK!

Next, I tried to autoscale the app:

$ kubectl autoscale deployment spring-boot-k8s --min=1 --max=5 --cpu-percent=10
horizontalpodautoscaler.autoscaling/spring-boot-k8s autoscaled
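
For context on what --cpu-percent=10 asks for: the Kubernetes HPA documentation describes the replica calculation as desiredReplicas = ceil(currentReplicas * currentMetricValue / desiredMetricValue), bounded by --min and --max. The sketch below works through that arithmetic in shell. The desired_replicas function and the 35% utilization figure are made up for illustration, and the controller's tolerance and stabilization behaviour are deliberately left out.

```shell
#!/bin/sh
# Sketch of the HPA replica arithmetic with integer percentages:
#   desired = ceil(current_replicas * current_util / target_util)
# clamped to [min, max]. Tolerance and stabilization are ignored here.
desired_replicas() {
    current="$1"; util="$2"; target="$3"; min="$4"; max="$5"
    # Integer ceiling of current * util / target
    desired=$(( (current * util + target - 1) / target ))
    if [ "$desired" -lt "$min" ]; then desired="$min"; fi
    if [ "$desired" -gt "$max" ]; then desired="$max"; fi
    echo "$desired"
}

# One pod at 35% of its CPU request, target 10%, bounds 1..5
desired_replicas 1 35 10 1 5   # prints 4
```

With a current value of <unknown>, as in the watch output that follows, there is nothing to plug into this formula, so the replica count stays where it is.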

I then started watching the HPA with the command below. It seems to have started:

$ watch -n 1 kubectl get hpa

Every 1.0s: kubectl get hpa

NAME                                                  REFERENCE                    TARGETS         MINPODS   MAXPODS   REPLICAS   AGE
horizontalpodautoscaler.autoscaling/spring-boot-k8s   Deployment/spring-boot-k8s   <unknown>/10%   1         5         1          8m10s

Then I used the ApacheBench HTTP load-testing utility to put load on the Spring Boot server and check whether k8s increases the number of pods:

$ ab -n 1000000 -c 100 http://192.168.49.2:30931/message

However, this did not increase the number of pods. What am I missing?

PS:

When I kill the ab command with Ctrl+C, it gives the following output (notice the approx. 5 s processing time per request):

$ ab -n 1000000 -c 100 http://192.168.49.2:32215/message
This is ApacheBench, Version 2.3 <$Revision: 1843412 $>
Copyright 1996 Adam Twiss, Zeus Technology Ltd, http://www.zeustech.net/
Licensed to The Apache Software Foundation, http://www.apache.org/

Benchmarking 192.168.49.2 (be patient)
^C

Server Software:        
Server Hostname:        192.168.49.2
Server Port:            32215

Document Path:          /message
Document Length:        4 bytes

Concurrency Level:      100
Time taken for tests:   35.650 seconds
Complete requests:      601
Failed requests:        0
Total transferred:      81736 bytes
HTML transferred:       2404 bytes
Requests per second:    16.86 [#/sec] (mean)
Time per request:       5931.751 [ms] (mean)
Time per request:       59.318 [ms] (mean, across all concurrent requests)
Transfer rate:          2.24 [Kbytes/sec] received

Connection Times (ms)
              min  mean[+/-sd] median   max
Connect:        0    1   1.5      0       7
Processing:  5001 5004   4.7   5003    5022
Waiting:     5000 5004   3.9   5002    5019
Total:       5001 5006   5.4   5003    5024

Percentage of the requests served within a certain time (ms)
  50%   5003
  66%   5005
  75%   5007
  80%   5009
  90%   5013
  95%   5020
  98%   5023
  99%   5023
 100%   5024 (longest request)

Update

As asked in the comments, here is the output of some more commands:

$ kubectl describe hpa spring-boot-k8s
Warning: autoscaling/v2beta2 HorizontalPodAutoscaler is deprecated in v1.23+, unavailable in v1.26+; use autoscaling/v2 HorizontalPodAutoscaler
Name:                                                  spring-boot-k8s
Namespace:                                             default
Labels:                                                <none>
Annotations:                                           <none>
CreationTimestamp:                                     Fri, 03 Feb 2023 01:58:06 +0530
Reference:                                             Deployment/spring-boot-k8s
Metrics:                                               ( current / target )
  resource cpu on pods  (as a percentage of request):  <unknown> / 10%
Min replicas:                                          1
Max replicas:                                          5
Deployment pods:                                       1 current / 0 desired
Conditions:
  Type           Status  Reason                   Message
  ----           ------  ------                   -------
  AbleToScale    True    SucceededGetScale        the HPA controller was able to get the target's current scale
  ScalingActive  False   FailedGetResourceMetric  the HPA was unable to compute the replica count: failed to get cpu utilization: unable to get metrics for resource cpu: no metrics returned from resource metrics API
Events:
  Type     Reason                   Age                  From                       Message
  ----     ------                   ----                 ----                       -------
  Warning  FailedGetResourceMetric  35m (x500 over 16h)  horizontal-pod-autoscaler  failed to get cpu utilization: unable to get metrics for resource cpu: no metrics returned from resource metrics API

Notice what it says: autoscaling/v2beta2 HorizontalPodAutoscaler is deprecated in v1.23+, unavailable in v1.26+; use autoscaling/v2 HorizontalPodAutoscaler.

It also says: no metrics returned from resource metrics API. However, my metrics server is running:

$ kubectl get deployment -n kube-system
NAME             READY   UP-TO-DATE   AVAILABLE   AGE
coredns          1/1     1            1           304d
metrics-server   1/1     1            1           40h

This seems to be why it's not working. But what could be causing it?

Apart from that, CPU utilization is not increasing much either. Until last night, I saw at most 3% CPU utilization in the output of watch kubectl top node:

NAME       CPU(cores)   CPU%   MEMORY(bytes)   MEMORY%
minikube   141m         1%     1127Mi          7%

But now it shows the following error:

Error from server (ServiceUnavailable): the server is currently unable to handle the request (get nodes.metrics.k8s.io)
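
The nodes.metrics.k8s.io in that error refers to the aggregated metrics API that metrics-server registers with the API server. One way to see whether the API server currently considers it available is to inspect the v1beta1.metrics.k8s.io APIService. A minimal sketch, assuming the standard metrics-server registration; the metrics_api_available helper name and the KUBECTL override are hypothetical:

```shell
#!/bin/sh
# Hypothetical helper: succeed only if the aggregated metrics API reports
# an "Available" condition with status True.
# KUBECTL can be overridden, e.g. with a stub, for testing.
KUBECTL="${KUBECTL:-kubectl}"

metrics_api_available() {
    status=$("$KUBECTL" get apiservice v1beta1.metrics.k8s.io \
        -o jsonpath='{.status.conditions[?(@.type=="Available")].status}')
    [ "$status" = "True" ]
}

# Example use: fall back to the metrics-server logs when the API is down.
#   metrics_api_available || kubectl -n kube-system logs deployment/metrics-server
```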

Solution

When I ran kubectl get pod, it gave me the following output:

$ kubectl get pod
NAME                               READY   STATUS    RESTARTS      AGE
spring-boot-k8s-556b578645-pwdhs   1/1     Running   3 (49m ago)   31d

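But I was not able to get the utilization of this pod, even after specifying its name in different ways. It gave errors such as invalid resource name and pod not found (note that the namespace in these commands is spelled efault rather than default, which is probably what caused the NotFound errors):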

$ kubectl top pod pod/spring-boot-k8s -n efault --containers
error: invalid resource name "pod/spring-boot-k8s": [may not contain '/']

$ kubectl top pod spring-boot-k8s -n efault --containers
Error from server (NotFound): pod "spring-boot-k8s" not found

$ kubectl top pod spring-boot-k8s-556b578645-pwdhs -n efault --containers
Error from server (NotFound): pod "spring-boot-k8s-556b578645-pwdhs" not found

$ kubectl top pod pod/spring-boot-k8s-556b578645-pwdhs -n efault --containers
error: invalid resource name "pod/spring-boot-k8s-556b578645-pwdhs": [may not contain '/']

My metrics server was already running:

$ kubectl get all -n kube-system | grep metric
pod/metrics-server-6b76bd68b6-f68f4    1/1     Running   68 (54m ago)   38d
service/metrics-server   ClusterIP   10.103.98.14   <none>        443/TCP                  38d
deployment.apps/metrics-server   1/1     1            1           38d
replicaset.apps/metrics-server-6b76bd68b6   1         1         1       38d

I was also able to see node utilization with kubectl top node:

$ kubectl top node
NAME       CPU(cores)   CPU%   MEMORY(bytes)   MEMORY%   
minikube   132m         1%     734Mi           4%

But I was not able to see the CPU utilization of the pod:

$ kubectl top pod
error: Metrics not available for pod default/spring-boot-k8s-556b578645-pwdhs, age: 747h9m12.416340419s

It turned out that there is a bug as discussed here and here.

The easy solution is to stop minikube and start it again with the kubelet housekeeping-interval argument:

$ minikube stop
$ minikube start --extra-config=kubelet.housekeeping-interval=10s

After this, I was able to get the pod CPU utilization:

$ kubectl top pod
NAME                               CPU(cores)   MEMORY(bytes)   
spring-boot-k8s-556b578645-pwdhs   2m           97Mi

This also enabled the k8s HPA to autoscale the pod under increased load. Note that it might take some time for the increased number of pod replicas to show up in the kubectl get pod output.
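
Because of that delay, it can be handy to poll rather than rerun kubectl get pod by hand. The following is a rough sketch, not part of the original fix: the wait_for_replicas helper, the KUBECTL override, and the POLL_SECONDS default of 15 are all assumptions chosen for illustration.

```shell
#!/bin/sh
# Hypothetical helper: poll a Deployment until it reports at least `want`
# ready replicas, or give up after `tries` attempts.
KUBECTL="${KUBECTL:-kubectl}"
POLL_SECONDS="${POLL_SECONDS:-15}"

wait_for_replicas() {
    deployment="$1"; want="$2"; tries="$3"
    i=0
    while [ "$i" -lt "$tries" ]; do
        ready=$("$KUBECTL" get deployment "$deployment" \
            -o jsonpath='{.status.readyReplicas}')
        # readyReplicas is omitted from the status while it is zero
        ready="${ready:-0}"
        if [ "$ready" -ge "$want" ]; then
            echo "ready replicas: $ready"
            return 0
        fi
        i=$((i + 1))
        sleep "$POLL_SECONDS"
    done
    echo "timed out with $ready ready replicas" >&2
    return 1
}

# Example use: wait up to 20 polls for at least 2 ready replicas.
#   wait_for_replicas spring-boot-k8s 2 20
```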