google-cloud-pubsub, stackdriver, google-cloud-monitoring

Cannot get ratio of metrics on GCP Monitoring: Error 400: the numerator is a delta metric but the denominator is not a delta metric


I'm trying to create an Alert Policy based on the ratio of failed messages in a Pub/Sub subscription. I'd like to use pubsub.googleapis.com/subscription/dead_letter_message_count as the numerator and pubsub.googleapis.com/subscription/pull_ack_request_count as the denominator. The alignment periods match, and I used a cross-series reducer to eliminate all labels, which removes the additional label in the denominator. The Alert Policy I intend to create looks like this:

monitoring/alertPolicy:AlertPolicy:
        combiner   : "AND"
        conditions : [
            [0]: {
                conditionThreshold: {
                    aggregations           : [
                        [0]: {
                            alignmentPeriod   : "600s"
                            crossSeriesReducer: "REDUCE_SUM"
                            perSeriesAligner  : "ALIGN_SUM"
                        }
                    ]
                    comparison             : "COMPARISON_GT"
                    denominatorAggregations: [
                        [0]: {
                            alignmentPeriod   : "600s"
                            crossSeriesReducer: "REDUCE_SUM"
                            perSeriesAligner  : "ALIGN_SUM"
                        }
                    ]
                    denominatorFilter      : "resource.type = \"pubsub_subscription\" AND resource.labels.subscription_id = \"subscription\" AND metric.type = \"pubsub.googleapis.com/subscription/pull_ack_request_count\""
                    duration               : "1800s"
                    filter                 : "resource.type = \"pubsub_subscription\" AND resource.labels.subscription_id = \"subscription\" AND metric.type = \"pubsub.googleapis.com/subscription/dead_letter_message_count\""
                    thresholdValue         : 0.5
                }
            }
        ]

But I'm getting the error:

Error creating AlertPolicy: googleapi: Error 400: The numerator is a delta metric but the denominator is not a delta metric.

This is confusing, as both metrics are DELTA. I used the API Explorer to retrieve the time series. For the numerator I get:

{
  "timeSeries": [
    {
      "metric": {
        "type": "pubsub.googleapis.com/subscription/dead_letter_message_count"
      },
      "resource": {
        "type": "pubsub_subscription",
        "labels": {
          "project_id": "redacted"
        }
      },
      "metricKind": "DELTA",
      "valueType": "INT64",
      "points": [
        {
          "interval": {
            "startTime": "2023-03-13T10:10:00Z",
            "endTime": "2023-03-13T10:20:00Z"
          },
          "value": {
            "int64Value": "0"
          }
        },
        ....,
        {
          "interval": {
            "startTime": "2023-03-13T09:10:00Z",
            "endTime": "2023-03-13T09:20:00Z"
          },
          "value": {
            "int64Value": "93"
          }
        },
        {
          "interval": {
            "startTime": "2023-03-13T09:00:00Z",
            "endTime": "2023-03-13T09:10:00Z"
          },
          "value": {
            "int64Value": "9"
          }
        },
        {
          "interval": {
            "startTime": "2023-03-13T08:50:00Z",
            "endTime": "2023-03-13T09:00:00Z"
          },
          "value": {
            "int64Value": "34"
          }
        }
      ]
    }
  ],
  "unit": "1"
}

And for the denominator:

{
  "timeSeries": [
    {
      "metric": {
        "type": "pubsub.googleapis.com/subscription/pull_ack_request_count"
      },
      "resource": {
        "type": "pubsub_subscription",
        "labels": {
          "project_id": "redacted"
        }
      },
      "metricKind": "DELTA",
      "valueType": "INT64",
      "points": [
        {
          "interval": {
            "startTime": "2023-03-13T09:50:00Z",
            "endTime": "2023-03-13T10:00:00Z"
          },
          "value": {
            "int64Value": "6"
          }
        },
        ....,
        {
          "interval": {
            "startTime": "2023-03-13T08:20:00Z",
            "endTime": "2023-03-13T08:30:00Z"
          },
          "value": {
            "int64Value": "104"
          }
        },
        {
          "interval": {
            "startTime": "2023-03-13T08:10:00Z",
            "endTime": "2023-03-13T08:20:00Z"
          },
          "value": {
            "int64Value": "93"
          }
        },
        {
          "interval": {
            "startTime": "2023-03-13T08:00:00Z",
            "endTime": "2023-03-13T08:10:00Z"
          },
          "value": {
            "int64Value": "111"
          }
        }
      ]
    }
  ],
  "unit": "1"
}
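
For reference, the two responses above come from projects.timeSeries.list calls with the same aggregation as the policy. Below is a minimal sketch of how such a request URL could be built; the function name and the example project ID are hypothetical, and the call itself (authentication, HTTP client) is left out.

```python
from urllib.parse import urlencode

BASE = "https://monitoring.googleapis.com/v3"


def time_series_list_url(project_id, metric_type, subscription_id,
                         start_time, end_time):
    """Build a projects.timeSeries.list URL that mirrors the policy's aggregation.

    project_id and subscription_id are placeholders supplied by the caller;
    start_time and end_time are RFC 3339 timestamps.
    """
    metric_filter = (
        'resource.type = "pubsub_subscription" '
        f'AND resource.labels.subscription_id = "{subscription_id}" '
        f'AND metric.type = "{metric_type}"'
    )
    params = {
        "filter": metric_filter,
        "interval.startTime": start_time,
        "interval.endTime": end_time,
        # Same settings as aggregations / denominatorAggregations in the policy.
        "aggregation.alignmentPeriod": "600s",
        "aggregation.perSeriesAligner": "ALIGN_SUM",
        "aggregation.crossSeriesReducer": "REDUCE_SUM",
    }
    return f"{BASE}/projects/{project_id}/timeSeries?{urlencode(params)}"


if __name__ == "__main__":
    # "my-project" is a made-up project ID for illustration only.
    print(time_series_list_url(
        "my-project",
        "pubsub.googleapis.com/subscription/pull_ack_request_count",
        "subscription",
        "2023-03-13T08:00:00Z",
        "2023-03-13T10:00:00Z",
    ))
```

Both metrics come back with "metricKind": "DELTA" from this endpoint, which is why the 400 error is surprising at first sight.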

Solution

  • This ratio cannot be expressed as a filter-based (JSON) threshold condition with a denominatorFilter, because of how the denominator metric is implemented. From Google:

    We got an update from the product team stating that the issue is due to the inconsistency in the delta field. Apparently the reason for this is that pull_ack_request_count has a delta windowing operation with an explicit window within its definition. This explicit window prevents the precomputation from being marked a delta.

    Ratios are a Google-internal query feature. We lack confidence in the implementation and can't promise that bugs won't turn up. The advice has generally been to use MQL instead of denominator filters.
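
Following that advice, the same ratio could be written as an MQL condition instead of filter plus denominatorFilter. The sketch below builds an AlertPolicy request body with a conditionMonitoringQueryLanguage condition; it keeps the 10-minute alignment, 1800s duration and 0.5 threshold from the question. The display names and the function name are hypothetical, and the query has not been verified against a live project, so treat it as a starting point rather than a confirmed fix.

```python
import json


def mql_ratio_policy(subscription_id, threshold=0.5):
    """Return an AlertPolicy body that alerts on dead-lettered / acked messages."""
    # Two fetches are aligned to 10-minute deltas, summed across all labels,
    # joined, and divided. The braces are doubled because this is an f-string.
    query = f"""
{{ fetch pubsub_subscription
   | metric 'pubsub.googleapis.com/subscription/dead_letter_message_count'
   | filter resource.subscription_id == '{subscription_id}'
   | align delta(10m)
   | group_by [], [dead_lettered: sum(value.dead_letter_message_count)]
 ; fetch pubsub_subscription
   | metric 'pubsub.googleapis.com/subscription/pull_ack_request_count'
   | filter resource.subscription_id == '{subscription_id}'
   | align delta(10m)
   | group_by [], [acked: sum(value.pull_ack_request_count)]
}}
| join
| value [ratio: val(0) / val(1)]
| condition ratio > {threshold}
""".strip()

    return {
        "displayName": "Dead-letter ratio (MQL)",  # hypothetical name
        "combiner": "AND",
        "conditions": [
            {
                "displayName": "dead_letter / pull_ack ratio",  # hypothetical name
                "conditionMonitoringQueryLanguage": {
                    "query": query,
                    "duration": "1800s",
                    "trigger": {"count": 1},
                },
            }
        ],
    }


if __name__ == "__main__":
    # Print the JSON body that would be sent to projects.alertPolicies.create.
    print(json.dumps(mql_ratio_policy("subscription"), indent=2))
```

Because the ratio is computed inside the query, the policy no longer depends on the service treating both metrics as the same kind, which is the check that produced the 400 error.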