kubernetes, amazon-eks, hpa

How does HPA work in an EKS cluster?


We are using an EKS cluster and a Helm chart for deployment. The Helm chart contains template YAML files for the CPU and memory HPAs. Here is the memory HPA YAML file:

apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: hpa-{{ .Release.Name }}-memory
  namespace: {{ .Release.Namespace }}
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: {{ .Release.Name }}
  minReplicas: 2
  maxReplicas: 4
  metrics:
  - type: Resource
    resource:
      name: memory
      target:
        type: Utilization
        averageUtilization: 200
  behavior:
    scaleDown:
      policies:
      - periodSeconds: 60
        type: Pods
        value: 1
      - periodSeconds: 30
        type: Percent
        value: 30
      selectPolicy: Min
      stabilizationWindowSeconds: 120
    scaleUp:
      policies:
      - periodSeconds: 30
        type: Pods
        value: 1
      - periodSeconds: 30
        type: Percent
        value: 200
      selectPolicy: Max
      stabilizationWindowSeconds: 60

When we run the kubectl get hpa command, we see:

root@SX-In:~# kubectl get hpa -A
NAMESPACE   NAME             REFERENCE        TARGETS     MINPODS   MAXPODS   REPLICAS   AGE
test1       hpa-abc-cpu      Deployment/abc   2%/200%     2         4         2          23d
test2       hpa-abc-memory   Deployment/abc   183%/200%   2         4

kubectl top pods shows the following:

root@SX-In:~# kubectl top pods -n test1
NAME                          CPU(CORES)    MEMORY(BYTES)
abcp1-78bb67d47f-lkwwr         5m           1148Mi
abcp1-78bb67d47f-vd26b         5m           917Mi

I read the docs but did not understand how this 183% memory value gets calculated. Can someone please explain?

I tried searching for the formula but could not find it.

The instance type where the pods are deployed is t2.xlarge (4 vCPUs, 16 GB RAM).

Please suggest


Solution

  • The percentage is the ratio of the Pod's actual memory usage to its resource requests. From the Kubernetes HPA documentation:

    For per-pod resource metrics (like CPU), the controller fetches the metrics from the resource metrics API for each Pod targeted by the HorizontalPodAutoscaler. Then, if a target utilization value is set, the controller calculates the utilization value as a percentage of the equivalent resource request on the containers in each Pod.

    I can't exactly reproduce the 183% number, but I can come close. Let's say your Deployment specifies that your Pods request 512 MiB of memory, with a hard limit of 2048 MiB. The actual memory (1148, 917 MiB) gets divided by the resource request to get a percentage (224%, 179%), and then those percentages get averaged across the Pods (202% in this specific calculation).
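    As a reference point, here is a minimal sketch of where that request would live in the container spec of the Deployment. The 512Mi / 2Gi values are only the assumptions used for the arithmetic above, so substitute whatever your chart actually sets:

        # Hypothetical container resources, for illustration only.
        # The HPA's "Utilization" target is measured against requests.memory,
        # not limits.memory.
        resources:
          requests:
            memory: 512Mi   # 1148Mi/512Mi ≈ 224%, 917Mi/512Mi ≈ 179%, average ≈ 202%
          limits:
            memory: 2Gi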

    That averaged utilization then gets fed into the HPA formula to compute the desired replica count.
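    From the same documentation, the algorithm is:

        desiredReplicas = ceil[ currentReplicas * ( currentMetricValue / desiredMetricValue ) ]

    With your observed 183% against the 200% target, the ratio is 183 / 200 ≈ 0.92. Assuming the default tolerance (the kube-controller-manager's --horizontal-pod-autoscaler-tolerance flag, 0.1 by default), that ratio is close enough to 1.0 that the controller skips scaling and leaves the replica count as it is.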

    Note that it looks like you have two HPAs trying to manage the same Deployment. This can be problematic in a situation like the one you show, where memory utilization is relatively high but CPU is low: the memory autoscaler could want to scale up while the CPU autoscaler wants to scale down to its minimum. Attaching both metrics to a single HorizontalPodAutoscaler would be a better setup, as sketched below.
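
    Here is a minimal sketch of what that could look like in your chart, keeping your existing replica bounds and reusing the 200% targets from your two current HPAs (adjust the targets to whatever thresholds you actually want):

        apiVersion: autoscaling/v2
        kind: HorizontalPodAutoscaler
        metadata:
          name: hpa-{{ .Release.Name }}
          namespace: {{ .Release.Namespace }}
        spec:
          scaleTargetRef:
            apiVersion: apps/v1
            kind: Deployment
            name: {{ .Release.Name }}
          minReplicas: 2
          maxReplicas: 4
          metrics:
          # The controller evaluates every metric and scales to the largest
          # replica count that any single metric asks for.
          - type: Resource
            resource:
              name: cpu
              target:
                type: Utilization
                averageUtilization: 200
          - type: Resource
            resource:
              name: memory
              target:
                type: Utilization
                averageUtilization: 200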