We are using an EKS cluster and a Helm chart for deployment. The Helm chart contains template YAML files for the CPU and memory HPAs. Here is the memory YAML file:
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: hpa-{{ .Release.Name }}-memory
  namespace: {{ .Release.Namespace }}
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: {{ .Release.Name }}
  minReplicas: 2
  maxReplicas: 4
  metrics:
    - type: Resource
      resource:
        name: memory
        target:
          type: Utilization
          averageUtilization: 200
  behavior:
    scaleDown:
      policies:
        - periodSeconds: 60
          type: Pods
          value: 1
        - periodSeconds: 30
          type: Percent
          value: 30
      selectPolicy: Min
      stabilizationWindowSeconds: 120
    scaleUp:
      policies:
        - periodSeconds: 30
          type: Pods
          value: 1
        - periodSeconds: 30
          type: Percent
          value: 200
      selectPolicy: Max
      stabilizationWindowSeconds: 60
When we run the kubectl get hpa command:
root@SX-In:~# kubectl get hpa -A
NAMESPACE   NAME             REFERENCE        TARGETS     MINPODS   MAXPODS   REPLICAS   AGE
test1       hpa-abc-cpu      Deployment/abc   2%/200%     2         4         2          23d
test2       hpa-abc-memory   Deployment/abc   183%/200%   2         4
kubectl top pods shows the following:
root@SX-In:~# kubectl top pods -n test1
NAME                     CPU(CORES)   MEMORY(BYTES)
abcp1-78bb67d47f-lkwwr   5m           1148Mi
abcp1-78bb67d47f-vd26b   5m           917Mi
I read the docs but did not understand how this 183% memory figure gets calculated. Can someone please explain?
I tried searching for the formula but could not find it.
The machine type the pods are deployed on is t2.xlarge (4 vCPUs, 16 GB RAM).
Please suggest.
The percentage is the ratio of the Pod's actual memory usage to its resource requests. From the Kubernetes HPA documentation:
For per-pod resource metrics (like CPU), the controller fetches the metrics from the resource metrics API for each Pod targeted by the HorizontalPodAutoscaler. Then, if a target utilization value is set, the controller calculates the utilization value as a percentage of the equivalent resource request on the containers in each Pod.
I can't exactly reproduce the 183% number, but I can come close. Say your Deployment specifies that your Pods request 512 MiB of memory, with a hard limit of 2048 MiB. The actual memory usage (1148 MiB, 917 MiB) gets divided by the resource request to get a per-Pod percentage (224%, 179%), and those percentages are then averaged across the Pods (202% in this specific calculation).
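The denominator in that division is the memory request in your Deployment's Pod template, not the node size or the memory limit. A minimal sketch of what that block could look like, using the hypothetical 512 MiB request from above (your real values will differ):

# Hypothetical container resources in the Deployment's Pod template.
# The HPA divides observed usage by requests.memory, not limits.memory.
resources:
  requests:
    memory: 512Mi   # denominator: 1148Mi / 512Mi ≈ 224%, 917Mi / 512Mi ≈ 179%
  limits:
    memory: 2048Mi  # ignored by the utilization calculation

You can check what your Pods actually request with:

kubectl get deployment abc -n test2 -o jsonpath='{.spec.template.spec.containers[*].resources.requests.memory}'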
That percentage then gets fed into the HPA algorithm to compute the new desired replica count:

desiredReplicas = ceil[currentReplicas * (currentMetricValue / desiredMetricValue)]
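Plugging in the values from your output, with 2 current replicas, 183% observed utilization, and a 200% target:

desiredReplicas = ceil[2 * (183 / 200)] = ceil[1.83] = 2

so the memory HPA currently wants to stay at 2 replicas; if the averaged utilization climbed above 200%, the ratio would exceed 1 and the HPA would scale up.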
Note that it looks like you have two HPAs trying to manage the same Deployment. This can be problematic in a situation like the one you show, where memory usage is relatively high but CPU usage is low: the memory autoscaler may want to scale up while the CPU autoscaler wants to scale down to its minimum. You can attach multiple metrics to a single autoscaler instead, and that would be a better setup; see the sketch below.
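A minimal sketch of a combined HPA template, reusing the names and the 200% targets from your existing charts (adjust the target utilizations to your needs):

apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: hpa-{{ .Release.Name }}
  namespace: {{ .Release.Namespace }}
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: {{ .Release.Name }}
  minReplicas: 2
  maxReplicas: 4
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 200
    - type: Resource
      resource:
        name: memory
        target:
          type: Utilization
          averageUtilization: 200

With multiple metrics on one HPA, the controller computes a desired replica count for each metric separately and uses the largest, so high memory can still trigger a scale-up even while CPU is low.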