kubernetes horizontal-scaling hpa

Horizontal Pod Autoscaling and resource configuration calibration


I am trying to understand how the HPA works, but I have some questions.

Suppose my service's resources are configured like this:

resources:
  limits:
    cpu: 500m
    memory: 1Gi
  requests:
    cpu: 250m
    memory: 512Mi

and I configure the HPA this way:

spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: test-service
  minReplicas: 3
  maxReplicas: 6
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 50
   

Does this prevent my service from reaching its CPU limit (500m)?
Would it be better to configure a higher value, such as 80%?

I ask because with this configuration the pods are scaled to the maximum number even though they use less CPU than the limit:

NAME                                  CPU(cores)   MEMORY(bytes)   
test-service-76f8b8c894-2f944            189m         283Mi           
test-service-76f8b8c894-2ztt6            183m         278Mi           
test-service-76f8b8c894-4htzg            117m         233Mi           
test-service-76f8b8c894-5hxhv            142m         193Mi           
test-service-76f8b8c894-6bzbj            140m         200Mi           
test-service-76f8b8c894-6sj5m            149m         261Mi    

The amount of CPU used is less than the request configured in the definition of the service.

This has also been discussed in "Using Horizontal Pod Autoscaling along with resource requests and limits", but I could not find the answer there.


Solution

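  • Does this prevent my service from reaching its CPU limit (500m)?

    No, the HPA does not prevent that (although resources.limits does cap usage at 500m). The HPA starts new replicas when the average CPU utilization across all pods rises above 50% of the requested CPU, i.e. above 125m per pod (50% of the 250m request). In your output the pods average roughly 153m each, which is above that threshold, so the HPA keeps adding replicas until it reaches maxReplicas.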

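  • Would it be better to configure a higher value, such as 80%?

    That is application specific. A higher target means fewer replicas for the same load, but it also leaves less headroom for spikes while new pods are starting.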

    Horizontal autoscaling is pretty well described in the documentation.
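
To make the numbers concrete, here is a minimal sketch of the scaling rule given in the Kubernetes documentation, desiredReplicas = ceil(currentReplicas * currentMetricValue / desiredMetricValue), applied to the figures from the question. The function name and parameters are made up for illustration, and the 0.1 tolerance is assumed to be the controller default. The sketch ignores pod readiness, missing metrics, and scale-down stabilization, so treat it as an approximation of what the controller does rather than its actual implementation.

```python
import math


def desired_replicas(usages_m, request_m, target_pct,
                     min_replicas, max_replicas, tolerance=0.1):
    """Approximate the HPA replica calculation for a CPU Utilization target.

    usages_m   -- per-pod CPU usage in millicores (from `kubectl top pods`)
    request_m  -- per-pod CPU request in millicores
    target_pct -- averageUtilization from the HPA spec
    tolerance  -- assumed default; no scaling if the ratio is within +/- 10% of 1
    """
    current = len(usages_m)
    # Utilization is measured against the *request*, not the limit.
    utilization_pct = 100 * sum(usages_m) / (request_m * current)
    ratio = utilization_pct / target_pct
    if abs(ratio - 1.0) <= tolerance:
        wanted = current
    else:
        wanted = math.ceil(current * ratio)
    # Clamp to the HPA's minReplicas / maxReplicas.
    return utilization_pct, max(min_replicas, min(max_replicas, wanted))


# Values copied from the question: six pods, 250m request, 50% target.
usages = [189, 183, 117, 142, 140, 149]
utilization, replicas = desired_replicas(usages, 250, 50, 3, 6)
print(f"{utilization:.1f}% of request -> {replicas} replicas")
```

With these inputs the average utilization is about 61% of the request, so the rule asks for more than six replicas and the result is clamped to maxReplicas. Calling the same function with target_pct=80 yields 5, which shows how a higher target lowers the replica count for the same load.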