Aggregate PromQL histogram for request duration across all paths for a multi-node service

Our service has multiple nodes, and each node produces a request_duration_histogram for each path it serves requests on, so our scrape looks something like this (showing just a couple of paths, two nodes and three buckets; the real implementation has many more):

request_duration_histogram_bucket{le="Infinity", path="path1", source="source1"}
request_duration_histogram_bucket{le="Infinity", path="path2", source="source1"}
request_duration_histogram_bucket{le="1", path="path1", source="source1"}
request_duration_histogram_bucket{le="1", path="path2", source="source1"}
request_duration_histogram_bucket{le="0.1", path="path1", source="source1"}
request_duration_histogram_bucket{le="0.1", path="path2", source="source1"}
request_duration_histogram_bucket{le="Infinity", path="path1", source="source2"}
request_duration_histogram_bucket{le="Infinity", path="path2", source="source2"}
request_duration_histogram_bucket{le="1", path="path1", source="source2"}
request_duration_histogram_bucket{le="1", path="path2", source="source2"}
request_duration_histogram_bucket{le="0.1", path="path1", source="source2"}
request_duration_histogram_bucket{le="0.1", path="path2", source="source2"}

request_duration_histogram_count{path="path1", source="source1"}
request_duration_histogram_sum{path="path1", source="source1"}
request_duration_histogram_count{path="path2", source="source1"}
request_duration_histogram_sum{path="path2", source="source1"}

request_duration_histogram_count{path="path1", source="source2"}
request_duration_histogram_sum{path="path1", source="source2"}
request_duration_histogram_count{path="path2", source="source2"}
request_duration_histogram_sum{path="path2", source="source2"}

Now, we're trying to calculate in Grafana the 0.95 quantile of the service as a whole (aggregating across both paths and sources) with the following query. We're aware of the limitations of calculating quantiles from a histogram.

histogram_quantile(
    0.95, 
    sum(
        rate(
            request_duration_histogram_bucket[$__rate_interval]
        )
    ) by (le)
)

but we're getting the following warning: PromQL info: input to histogram_quantile needed to be fixed for monotonicity (and may give inaccurate results) for metric name ""

Looking at the docs, my understanding is that the message warns us that by aggregating across nodes and paths, we are breaking a requirement on the function's input.

Is it that the rate for a path can decrease between two scrapes, and this breaks monotonicity in the vector passed to histogram_quantile?

This seems to be a possibility for pretty much any metric that is being rated, so I'm not sure if the message is overzealous or we cannot actually use histograms for this type of query.

Another couple of notes:

using sum(increase(...)) makes the message go away
we cannot add path or source to the by (...) clause, since we want to see the quantile across ALL the paths and nodes

Should we revert to using a Summary with quantiles for this (even if they need to be aggregated server-side in grafana)?

Thank you!

Solution

I spent some more time looking at the issue, and my understanding of the problem turned out to be incorrect.

Based on this bugfix in the Prometheus project, the warning appears when buckets for the same target, within the same scrape, have values that decrease as le increases. Cumulative bucket values are expected to never decrease from one bucket to the next larger one.

We analyzed the values for the query reporting the issue, and found the following (still simplified from the real use case):

Timestamp      le  source   path  value
17224422000000 .1  source1  path1 0.04166666666666667
17224422000000 .25 source1  path1 0.041666666666666664
17224422000000 .5  source1  path1 0.04166666666666667

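My understanding is now that the issue is reported because the .25 bucket has a smaller value than the .1 bucket. The two values differ only in the last digit, which looks like floating-point rounding in the rate calculation rather than a real inconsistency in the data, and that is what the bugfix above should have addressed.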
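To make the size of that difference concrete, here is a minimal Python sketch using the three values from the table. The function name fix_monotonicity and the tolerance 1e-12 are assumptions made for this illustration; this is not the Prometheus implementation, only one way a check could tell rounding noise apart from a real drop.

```python
import math

# Rates copied from the table above, ordered by increasing le (.1, .25, .5).
bucket_rates = [0.04166666666666667, 0.041666666666666664, 0.04166666666666667]

# Hypothetical tolerance, chosen for this sketch only.
REL_TOLERANCE = 1e-12


def fix_monotonicity(values, rel_tol=REL_TOLERANCE):
    """Return (fixed_values, forced).

    fixed_values never decreases from one bucket to the next.
    forced is True only if a drop larger than the tolerance was corrected.
    """
    fixed = []
    forced = False
    previous = 0.0
    for value in values:
        if value < previous:
            if not math.isclose(value, previous, rel_tol=rel_tol):
                forced = True  # a genuine drop, worth reporting
            value = previous  # clamp to the previous bucket's value
        fixed.append(value)
        previous = value
    return fixed, forced


drop = bucket_rates[0] - bucket_rates[1]
fixed, forced = fix_monotonicity(bucket_rates)
print(f"drop between le=.1 and le=.25: {drop:.3e}")
print(f"fixed values: {fixed}")
print(f"real violation: {forced}")
```

With these inputs the drop is far below the tolerance, so the sketch clamps the value silently. A drop such as 5.0 followed by 3.0 would be reported as a real violation.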

Any Prometheus version ≥ 2.49.0 should not have this issue, and the query above still seems to be the correct way of aggregating across histograms.
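As a rough illustration of why summing by le is safe, the Python sketch below aggregates per-series bucket rates and then estimates a quantile by linear interpolation, which is how the Prometheus documentation describes histogram_quantile. The sample rates and the helper names sum_by_le and quantile_from_buckets are made up for this example, and the sketch leaves out details of the real function.

```python
import math
from collections import defaultdict

# Hypothetical per-series bucket rates: (le, path, source) -> rate.
series = {
    (0.1, "path1", "source1"): 1.0,
    (1.0, "path1", "source1"): 3.0,
    (math.inf, "path1", "source1"): 4.0,
    (0.1, "path2", "source2"): 2.0,
    (1.0, "path2", "source2"): 2.0,
    (math.inf, "path2", "source2"): 2.0,
}


def sum_by_le(samples):
    """Mimic sum(...) by (le): add up all series that share the same le."""
    totals = defaultdict(float)
    for (le, _path, _source), rate in samples.items():
        totals[le] += rate
    return sorted(totals.items())


def quantile_from_buckets(q, buckets):
    """Estimate a quantile from sorted cumulative (le, count) pairs using
    linear interpolation inside the bucket that contains the rank."""
    rank = q * buckets[-1][1]
    lower_bound, lower_count = 0.0, 0.0
    for i, (le, count) in enumerate(buckets):
        if count >= rank and count > lower_count:
            if math.isinf(le):
                # Rank falls in the +Inf bucket: report the highest finite bound.
                return buckets[i - 1][0] if i > 0 else math.nan
            fraction = (rank - lower_count) / (count - lower_count)
            return lower_bound + (le - lower_bound) * fraction
        lower_bound, lower_count = le, count
    return math.nan


aggregated = sum_by_le(series)
is_monotonic = all(a[1] <= b[1] for a, b in zip(aggregated, aggregated[1:]))
print(aggregated, is_monotonic)
print(quantile_from_buckets(0.95, aggregated))
```

Each input series is non-decreasing in le, so the per-le sums are too. In this made-up data the 0.95 rank lands in the +Inf bucket, so the estimate is capped at the highest finite bound, which is one of the known limitations of histogram-based quantiles.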