kubernetes · hpa · horizontal-pod-autoscaling

How to make k8s cpu and memory HPA work together?


I'm using a k8s HPA template for CPU and memory like below:

---
apiVersion: autoscaling/v1
kind: HorizontalPodAutoscaler
metadata:
  name: {{.Chart.Name}}-cpu
  labels:
    app: {{.Chart.Name}}
    chart: {{.Chart.Name}}
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: {{.Chart.Name}}
  minReplicas: {{.Values.hpa.min}}
  maxReplicas: {{.Values.hpa.max}}
  targetCPUUtilizationPercentage: {{.Values.hpa.cpu}}
---
apiVersion: autoscaling/v2beta2
kind: HorizontalPodAutoscaler
metadata:
  name: {{.Chart.Name}}-mem
  labels:
    app: {{.Chart.Name}}
    chart: {{.Chart.Name}}
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: {{.Chart.Name}}
  minReplicas: {{.Values.hpa.min}}
  maxReplicas: {{.Values.hpa.max}}
  metrics:
    - type: Resource
      resource:
        name: memory
        target:
          type: Utilization
          averageUtilization: {{.Values.hpa.mem}}

Having two separate HPAs causes a problem: any new pod spun up by the memory HPA is immediately terminated by the CPU HPA, because that pod's CPU usage is below the CPU HPA's scale-down threshold. The CPU HPA always terminates the newest pod, which leaves the older pods running and retriggers the memory HPA, producing an infinite loop. Is there a way to instruct the CPU HPA to terminate pods with higher usage rather than the newest pods every time?


Solution

  • As per the suggestion in the comments, using a single HPA solved my issue. I just had to move the CPU metric into the same apiVersion (autoscaling/v2beta2) as the memory HPA, so both metrics are evaluated by one autoscaler.
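
    A combined manifest along these lines (a sketch assembled from the two templates in the question; the value names .Values.hpa.cpu and .Values.hpa.mem are carried over from there and assume both are utilization percentages):

    ---
    apiVersion: autoscaling/v2beta2
    kind: HorizontalPodAutoscaler
    metadata:
      name: {{.Chart.Name}}
      labels:
        app: {{.Chart.Name}}
        chart: {{.Chart.Name}}
    spec:
      scaleTargetRef:
        apiVersion: apps/v1
        kind: Deployment
        name: {{.Chart.Name}}
      minReplicas: {{.Values.hpa.min}}
      maxReplicas: {{.Values.hpa.max}}
      metrics:
        # With both metrics in one HPA, the controller computes a desired
        # replica count per metric and uses the largest, so a scale-up
        # driven by memory is no longer undone by the CPU metric.
        - type: Resource
          resource:
            name: cpu
            target:
              type: Utilization
              averageUtilization: {{.Values.hpa.cpu}}
        - type: Resource
          resource:
            name: memory
            target:
              type: Utilization
              averageUtilization: {{.Values.hpa.mem}}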