I was adding metric alerts to monitor a web API hosted in AKS and was looking at the azurerm_monitor_scheduled_query_rules_alert resource.
I could not tell the difference between the two thresholds in the trigger block. What is the purpose of each one, and where does each apply?
trigger {
  operator  = "GreaterThan"
  threshold = 3
  metric_trigger {
    operator            = "GreaterThan"
    threshold           = 1
    metric_trigger_type = "Total"
    metric_column       = "operation_Name"
  }
}
Through trial and error, we found that metric_trigger's threshold maps to "minFailingPeriodsToAlert", i.e. how many times the outer threshold must be exceeded for the alert to fire.
We applied an alert with a trigger like this:
trigger {
  operator  = "GreaterThan"
  threshold = 3
  metric_trigger {
    metric_trigger_type = "Total"
    operator            = "GreaterThanOrEqual"
    threshold           = 100
    metric_column       = "fileCount"
  }
}
and it created this resource in Azure:
"criteria": {
  "allOf": [
    {
      "query": "customEvents | where parsedStatus != \"RUNNING\" and parsedStatus != \"SUCCESS\" ",
      "timeAggregation": "Average",
      "metricMeasureColumn": "AggregatedValue",
      "dimensions": [
        {
          "name": "itemCount",
          "operator": "Include",
          "values": [
            "*"
          ]
        }
      ],
      "operator": "GreaterThan",
      "threshold": 3,
      "failingPeriods": {
        "numberOfEvaluationPeriods": null,
        "minFailingPeriodsToAlert": 100
      }
    }
  ]
},
In the end we set metric_trigger's threshold to 0 and used the "regular" (outer) threshold to configure our alert. I might use a non-zero metric_trigger threshold if our application got bursts of traffic: it would let the server process the backlog of files without firing the alert immediately.
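For reference, a rough sketch of the trigger we ended up with might look like the fragment below. It goes inside an azurerm_monitor_scheduled_query_rules_alert resource. The operator inside metric_trigger and the fileCount column are assumptions carried over from the earlier example rather than our exact settings, and the mapping in the comments is what we observed by testing, not something taken from the provider documentation.

```hcl
trigger {
  # "Regular" threshold: compared against the query result.
  operator  = "GreaterThan"
  threshold = 3

  metric_trigger {
    metric_trigger_type = "Total"
    # Assumption: with GreaterThan and 0, a single breach is enough to fire.
    operator            = "GreaterThan"
    # In our test this value ended up as minFailingPeriodsToAlert in Azure.
    threshold           = 0
    metric_column       = "fileCount"
  }
}
```

Raising the inner threshold (for example to 2 or 3) should make the alert wait for repeated breaches before firing, which is the burst-traffic case mentioned above.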