The steps I use to set up a custom metrics HPA work on a standard GKE cluster but do not work on an Autopilot one.
I use the custom-metrics-stackdriver-adapter to implement an HPA based on the number of unacknowledged Pub/Sub messages.
In both cases (Standard and Autopilot), the idle state is correct: the number of running pods matches minReplicas.
However, only the Standard cluster correctly scales the number of pods up to maxReplicas under traffic.
The only difference between the two setups is how I create the cluster. In GKE Standard:
gcloud container clusters create $CLUSTER_NAME \
--region=$REGION \
--project=$PROJECT_ID
In Autopilot GKE:
gcloud container clusters create-auto $CLUSTER_NAME \
--region=$REGION \
--project=$PROJECT_ID
Could this be because autoscaling/v2beta2 is not compatible with Autopilot? Should I use autoscaling/v2 instead? What else could it be?
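For reference, here is a minimal sketch of what the HPA could look like under autoscaling/v2, which has been the stable API since Kubernetes 1.23 (autoscaling/v2beta2 is no longer served as of 1.26). The HPA name, Deployment name, subscription ID, replica counts, and target value are all hypothetical placeholders. The metric name is an assumption based on the usual Pub/Sub backlog metric exposed through the Stackdriver adapter, so it may not match the actual setup.

```yaml
# Sketch only: every name and number below is a placeholder, not the real config.
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: pubsub-hpa                # hypothetical HPA name
spec:
  minReplicas: 1                  # hypothetical idle replica count
  maxReplicas: 5                  # hypothetical upper bound
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: pubsub-worker           # hypothetical Deployment that pulls from the subscription
  metrics:
  - type: External
    external:
      metric:
        # Assumed metric: backlog of undelivered messages on the subscription.
        name: pubsub.googleapis.com|subscription|num_undelivered_messages
        selector:
          matchLabels:
            resource.labels.subscription_id: my-subscription   # hypothetical subscription ID
      target:
        type: AverageValue
        averageValue: "2"         # hypothetical target backlog per pod
```

Switching the apiVersion alone would not necessarily explain a difference between Standard and Autopilot if both clusters run the same Kubernetes version, but it rules the API version out as a variable.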
Because of the workload limitations and restrictions in Autopilot, the HPA may be unable to scale the replica sets. Check the logs for any errors and paste them here. My best suggestion is to run this HPA on a Standard cluster.
Starting from Kubernetes v1.18, the v2beta2 API allows scaling behavior to be configured through the HPA behavior field.
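As an illustration of the behavior field, the sketch below sets separate scale-up and scale-down policies. It is written against autoscaling/v2, where the field has the same shape as in v2beta2. The resource names and every numeric value are hypothetical, and the metrics block is omitted for brevity, so the existing metrics definition would need to be kept in a real manifest.

```yaml
# Sketch only: names and numbers are illustrative assumptions.
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: pubsub-hpa                # hypothetical HPA name
spec:
  minReplicas: 1
  maxReplicas: 5
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: pubsub-worker           # hypothetical Deployment name
  # metrics: omitted here; keep the existing metrics block in a real manifest.
  behavior:
    scaleUp:
      stabilizationWindowSeconds: 0     # react to load immediately
      policies:
      - type: Pods
        value: 2                        # add at most 2 pods...
        periodSeconds: 60               # ...per 60-second window
    scaleDown:
      stabilizationWindowSeconds: 300   # wait 5 minutes before scaling down
      policies:
      - type: Percent
        value: 50                       # remove at most 50% of current pods...
        periodSeconds: 60               # ...per 60-second window
```

Note that behavior only shapes how fast the HPA moves between replica counts. It does not help if the HPA cannot read the metric in the first place.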