kubernetes, google-kubernetes-engine, autoscaling

GKE autoscaler 'optimize-utilization'


Can anyone explain what the 'optimize-utilization' setting for the GKE autoscaler specifically does differently from standard autoscaling? It claims to be more aggressive in downscaling, but does that mean it ignores the pod disruption budget, uses a different resource-utilization threshold (50% for the standard profile), or waits 1 minute before scaling down instead of the normal 10 minutes? It is all very vague to me, and I want to know the consequences before turning it on.


Solution

  • From the Cluster Autoscaler documentation:

    optimize-utilization: Prioritize optimizing utilization over keeping spare resources in the cluster. When enabled, Cluster Autoscaler will scale down the cluster more aggressively: it can remove more nodes, and remove nodes faster. This profile has been optimized for use with batch workloads that are not sensitive to start-up latency. We do not currently recommend using this profile with serving workloads.

    Promoted Autoscaling Profiles to beta. Use with gcloud beta container clusters create or gcloud container clusters update: --autoscaling-profile=balanced (default) or --autoscaling-profile=optimize-utilization.

    At beta, products or features are ready for broader customer testing and use. Betas are often publicly announced. There are no SLAs or technical support obligations in a beta release unless otherwise specified in product terms or the terms of a particular beta program. The average beta phase lasts about six months.
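As a rough sketch of how the flag quoted above might be applied, the snippet below only prints the `gcloud` command rather than running it. The helper name `profile_cmd` and the cluster name `my-cluster` are hypothetical; the two profile values and the `--autoscaling-profile` flag are the ones named in the quote. A real invocation would likely also need a location flag such as `--zone` or `--region`, which is omitted here.

```shell
#!/bin/sh
# Hypothetical helper (not part of gcloud): builds the update command
# for a given cluster and autoscaling profile, and prints it as a dry run.
profile_cmd() {
  cluster="$1"
  profile="$2"
  # Accept only the two profile names mentioned in the quoted release note.
  case "$profile" in
    balanced|optimize-utilization) ;;
    *) echo "unknown profile: $profile" >&2; return 1 ;;
  esac
  echo "gcloud container clusters update $cluster --autoscaling-profile=$profile"
}

# Print the command that would switch the (hypothetical) cluster to the aggressive profile.
profile_cmd my-cluster optimize-utilization
```

Printing the command first makes it easy to review before applying it, and switching back should be the same call with `balanced`.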

    More references: