
Difference between HorizontalPodAutoscaler and GKE Cluster Autoscaler


I can automatically provision nodes in response to increased load either via Cluster Autoscaler, which I can enable when creating a cluster:

gcloud container clusters create example-cluster \
  --num-nodes 1 \
  --zone us-central1-a \
  --node-locations us-central1-a,us-central1-b,us-central1-f \
  --enable-autoscaling --min-nodes 1 --max-nodes 4

Or via Horizontal Pod Autoscaling, which I can apply to an existing cluster with:

kubectl autoscale deployment <deployment-name> --min=1 --max=4

What is the difference between Cluster Autoscaler and Horizontal Pod Autoscaling? They seem like alternate approaches to the same goal (having infrastructure adjust dynamically in response to greater resource needs). For example, both allow a minimum and maximum node count to be specified. Yet the documentation for each makes no reference to the other.


Solution

  • Horizontal Pod Autoscaler: changes the number of replicas of a targeted deployment, based on a metric you specify (e.g. CPU utilization). It only affects the number of pods, not the number of nodes.

  • Cluster Autoscaler: adds nodes to a node pool when pods cannot be scheduled for lack of resources (e.g. CPU or memory), and removes nodes that are underutilized. It only affects the number of nodes, not the number of pods.

    (Note that you can use both at the same time. The --min and --max flags of kubectl autoscale bound the replica count, whereas --min-nodes and --max-nodes bound the number of nodes.)
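
To make the division of labor concrete, here is a rough sketch of the two decisions as shell arithmetic. The replica calculation follows the formula given in the Kubernetes HPA documentation, desiredReplicas = ceil(currentReplicas * currentMetric / targetMetric), but leaves out its tolerance and stabilization behavior. The node calculation is a hypothetical simplification of my own: the real Cluster Autoscaler simulates scheduling of pending pods rather than dividing CPU totals. The function names and the sample numbers (500m CPU request per pod, 1500m allocatable per node) are assumptions for illustration only.

```shell
# Toy model only, not the real controllers' logic.

# Integer ceiling division: ceil($1 / $2) for positive integers.
ceil_div() {
  echo $(( ($1 + $2 - 1) / $2 ))
}

# Clamp value $1 into the range [$2, $3].
clamp() {
  local v=$1
  if [ "$v" -lt "$2" ]; then v=$2; fi
  if [ "$v" -gt "$3" ]; then v=$3; fi
  echo "$v"
}

# HPA side: how many pod replicas should the deployment have?
# args: current_replicas current_cpu_percent target_cpu_percent min max
hpa_desired_replicas() {
  local desired
  desired=$(ceil_div $(( $1 * $2 )) "$3")
  clamp "$desired" "$4" "$5"
}

# Cluster Autoscaler side (hypothetical simplification): how many nodes are
# needed so that every pod's CPU request fits? Assumes one pod fits on a node.
# args: replicas pod_request_millicpu node_allocatable_millicpu min_nodes max_nodes
nodes_needed() {
  local pods_per_node=$(( $3 / $2 ))
  local nodes
  nodes=$(ceil_div "$1" "$pods_per_node")
  clamp "$nodes" "$4" "$5"
}

# Load rises: 2 replicas at 90% CPU against a 50% target.
replicas=$(hpa_desired_replicas 2 90 50 1 4)
# More replicas may no longer fit on the current nodes.
nodes=$(nodes_needed "$replicas" 500 1500 1 4)
echo "replicas=$replicas nodes=$nodes"
```

With these sample numbers the HPA step asks for 4 replicas, and only then does the node step conclude that 2 nodes are needed. That is the typical chain when both are enabled: the HPA reacts to the metric, and the Cluster Autoscaler reacts to the pods that the HPA created but that cannot be scheduled.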