I keep reading contradictory opinions on how autoscaling works in Azure with regard to duration time and cooldown time.
So far I have concluded that if the metric threshold is met after the cooldown time, then a new instance is immediately added (or removed), but I might be wrong, so please correct me.
Example:
| 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 |
| 11 | 12 | 13 | 14 | 15 | 16 | 17 | 18 | 19 | 20 |
| 21 | 22 | 23 | 24 | 25 | 26 | 27 | 28 | 29 | 30 |
| 31 | 32 | 33 | 34 | 35 | 36 | 37 | 38 | 39 | 40 |
| 41 | 42 | 43 | 44 | 45 | 46 | 47 | 48 | 49 | 50 |
| 51 | 52 | 53 | 54 | 55 | 56 | 57 | 58 | 59 | 60 |
Duration = 10 minutes
Cool down = 5 minutes
Scale out threshold = 70% CPU usage
Starting with 1 instance (assumed)
1 instance is added in every scale out
Let's say that, from the moment we start a machine, the CPU usage is constantly > 70% for 60 mins:
I'm sure that:
1 instance is added at the end of the duration time which is 10 mins
So, min|11| = 2 instances
Before a new instance will be added we have to wait for the 5 mins of cooldown time.
Azure keeps collecting metrics during the cooldown period.
This is where the confusion starts:
Scenario1:
1 instance is added at the end of the cooldown time which is 5 mins
(Since the criterion is met over this period | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 | 13 | 14 | 15 |)
So, min|16| = 3 instances(
min|11|(end of duration0) 2 instances ->
min|16|(end of duration0 + cooldown0) 3 instances
)
So, an instance will be added every 5 mins (taking into account that the CPU usage is constantly over 70%)
The next ones will be added at min|21|, min|26|,....
Scenario2:
1 instance is added at the end of the cooldown time (5 mins) plus a new duration window (10 mins)
So, min|26| = 3 instances(
min|11|(end of duration0) 2 instances ->
min|16|(end of duration0 + cooldown0) 2 instances ->
min|26|(end of duration0 + end of cooldown0 + end of duration01) 3 instances
)
So, an instance will be added every 15 mins (taking into account that the CPU usage is constantly over 70%)
The next ones will be added at min|41|, min|56|,....
Please only reply if you are sure; there is enough confusion out there already. A real example with screenshots would be helpful.
The correct behavior aligns with Scenario 2. After the initial scaling action, the system waits for the cooldown period to end before starting a new duration evaluation. Therefore, an instance is added every 15 minutes if the CPU usage consistently exceeds 70%.
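To make the arithmetic of the two scenarios concrete, here is a minimal Python sketch. It is not Azure's autoscale engine, only a toy model of the two interpretations from the question, assuming CPU stays above the threshold the whole time and ignoring instance limits and evaluation-interval timing. The function name `scale_out_minutes` and its parameters are hypothetical. Times are elapsed minutes since start, so an event at 10 corresponds to "min|11|" in the question's notation.

```python
def scale_out_minutes(duration, cooldown, horizon, fresh_window_after_cooldown):
    """Return the elapsed minutes at which one instance is added.

    Toy model (an assumption, not Azure's actual implementation):
    the metric breaches the threshold for the entire horizon.
    """
    events = []
    t = duration  # first scale-out once one full duration window has breached
    while t <= horizon:
        events.append(t)
        if fresh_window_after_cooldown:
            # Scenario 2: wait for the cooldown, then for a new duration window.
            t += cooldown + duration
        else:
            # Scenario 1: wait for the cooldown only; the lookback window
            # is allowed to overlap the cooldown period.
            t += cooldown
    return events


scenario1 = scale_out_minutes(10, 5, 60, fresh_window_after_cooldown=False)
scenario2 = scale_out_minutes(10, 5, 60, fresh_window_after_cooldown=True)

for name, events in (("Scenario 1", scenario1), ("Scenario 2", scenario2)):
    # Start with 1 instance and add 1 per scale-out event.
    print(name, "scale-outs at:", events, "final instances:", 1 + len(events))
```

Under these assumptions, the Scenario 2 model produces scale-outs at elapsed minutes 10, 25, 40 and 55, which is one instance every 15 minutes.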
For more information:
- https://learn.microsoft.com/en-us/azure/azure-monitor/autoscale/autoscale-understanding-settings
- https://learn.microsoft.com/en-us/azure/azure-monitor/autoscale/autoscale-best-practices
If the links above do not help, I would suggest raising a support request directly with Azure, since autoscale behaviour can be tricky to reason about.