I'm trying to create an Alert Policy based on the ratio of failed messages in a Pub/Sub subscription. I'd like to use pubsub.googleapis.com/subscription/dead_letter_message_count as the numerator and pubsub.googleapis.com/subscription/pull_ack_request_count as the denominator. The alignment periods match, and I used a cross-series reducer that eliminates all labels to get rid of the additional label in the denominator. The Alert Policy I intend to create looks like:
monitoring/alertPolicy:AlertPolicy:
combiner : "AND"
conditions : [
[0]: {
conditionThreshold: {
aggregations : [
[0]: {
alignmentPeriod : "600s"
crossSeriesReducer: "REDUCE_SUM"
perSeriesAligner : "ALIGN_SUM"
}
]
comparison : "COMPARISON_GT"
denominatorAggregations: [
[0]: {
alignmentPeriod : "600s"
crossSeriesReducer: "REDUCE_SUM"
perSeriesAligner : "ALIGN_SUM"
}
]
denominatorFilter : "resource.type = \"pubsub_subscription\" AND resource.labels.subscription_id = \"subscription\" AND metric.type = \"pubsub.googleapis.com/subscription/pull_ack_request_count\""
duration : "1800s"
filter : "resource.type = \"pubsub_subscription\" AND resource.labels.subscription_id = \"subscription\" AND metric.type = \"pubsub.googleapis.com/subscription/dead_letter_message_count\""
thresholdValue : 0.5
}
}
]
But I'm getting the error:
Error creating AlertPolicy: googleapi: Error 400: The numerator is a delta metric but the denominator is not a delta metric.
This is confusing, because both metrics are reported as DELTA. I used the API Explorer to retrieve the time series. For the numerator I get:
{
"timeSeries": [
{
"metric": {
"type": "pubsub.googleapis.com/subscription/dead_letter_message_count"
},
"resource": {
"type": "pubsub_subscription",
"labels": {
"project_id": "redacted"
}
},
"metricKind": "DELTA",
"valueType": "INT64",
"points": [
{
"interval": {
"startTime": "2023-03-13T10:10:00Z",
"endTime": "2023-03-13T10:20:00Z"
},
"value": {
"int64Value": "0"
}
},
....,
{
"interval": {
"startTime": "2023-03-13T09:10:00Z",
"endTime": "2023-03-13T09:20:00Z"
},
"value": {
"int64Value": "93"
}
},
{
"interval": {
"startTime": "2023-03-13T09:00:00Z",
"endTime": "2023-03-13T09:10:00Z"
},
"value": {
"int64Value": "9"
}
},
{
"interval": {
"startTime": "2023-03-13T08:50:00Z",
"endTime": "2023-03-13T09:00:00Z"
},
"value": {
"int64Value": "34"
}
}
]
}
],
"unit": "1"
}
And for the denominator:
{
"timeSeries": [
{
"metric": {
"type": "pubsub.googleapis.com/subscription/pull_ack_request_count"
},
"resource": {
"type": "pubsub_subscription",
"labels": {
"project_id": "redacted"
}
},
"metricKind": "DELTA",
"valueType": "INT64",
"points": [
{
"interval": {
"startTime": "2023-03-13T09:50:00Z",
"endTime": "2023-03-13T10:00:00Z"
},
"value": {
"int64Value": "6"
}
},
....,
{
"interval": {
"startTime": "2023-03-13T08:20:00Z",
"endTime": "2023-03-13T08:30:00Z"
},
"value": {
"int64Value": "104"
}
},
{
"interval": {
"startTime": "2023-03-13T08:10:00Z",
"endTime": "2023-03-13T08:20:00Z"
},
"value": {
"int64Value": "93"
}
},
{
"interval": {
"startTime": "2023-03-13T08:00:00Z",
"endTime": "2023-03-13T08:10:00Z"
},
"value": {
"int64Value": "111"
}
}
]
}
],
"unit": "1"
}
This ratio cannot be defined as a JSON-based (filter and denominatorFilter) alert policy because of an implementation detail of the denominator metric. From Google:
We got an update from the product team stating that the issue is due to the inconsistency in the delta field. Apparently the reason for this is that pull_ack_request_count has a delta windowing operation with an explicit window within its definition. This explicit window prevents the precomputation from being marked a delta.
Ratios are a Google-internal query feature. We lack confidence in the implementation and can't promise that bugs won't turn up. The advice has generally been to use MQL instead of denominator filters.
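Following that advice, here is a rough sketch of what an MQL-based version of the same condition might look like. It is written as a small Python script that builds the policy body, so the query string and the surrounding JSON stay together. I have not run this query against a live project, so treat the MQL itself as an assumption to verify in Metrics Explorer first. The helper names (build_mql_ratio_query, build_alert_policy) and the display names are made up for illustration; conditionMonitoringQueryLanguage is the AlertPolicy condition field that carries an MQL query, and the 10m window, 0.5 threshold and 1800s duration are copied from the question.

```python
import json

NUMERATOR = "pubsub.googleapis.com/subscription/dead_letter_message_count"
DENOMINATOR = "pubsub.googleapis.com/subscription/pull_ack_request_count"


def build_mql_ratio_query(subscription_id: str, window: str = "10m",
                          threshold: float = 0.5) -> str:
    """Return an MQL query string (untested sketch) for the ratio condition."""

    def branch(metric_type: str) -> str:
        # Each branch aligns one metric to the same window and sums away
        # every label except the subscription id, so both sides line up.
        return "\n".join([
            "fetch pubsub_subscription",
            f"  | metric '{metric_type}'",
            f"  | filter resource.subscription_id == '{subscription_id}'",
            f"  | align delta({window})",
            "  | group_by [resource.subscription_id], sum(val())",
        ])

    return "\n".join([
        "{ " + branch(NUMERATOR),
        "; " + branch(DENOMINATOR) + " }",
        "| ratio",
        f"| condition val() > {threshold}",
    ])


def build_alert_policy(subscription_id: str) -> dict:
    """Build an AlertPolicy body that uses an MQL condition instead of
    filter/denominatorFilter. Display names are placeholders."""
    return {
        "displayName": "Dead-letter ratio (MQL sketch)",
        "combiner": "AND",
        "conditions": [
            {
                "displayName": "dead_letter / pull_ack ratio above 0.5",
                "conditionMonitoringQueryLanguage": {
                    "query": build_mql_ratio_query(subscription_id),
                    "duration": "1800s",
                },
            }
        ],
    }


if __name__ == "__main__":
    # "subscription" is the placeholder id used in the question.
    print(json.dumps(build_alert_policy("subscription"), indent=2))
```

The printed JSON has the same shape as the original policy, with the conditionThreshold block swapped for conditionMonitoringQueryLanguage. Because the alignment is written explicitly in the query, it should not depend on how the metric's built-in windowing is flagged, which appears to be what trips up the denominatorFilter path.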