I don't quite understand what factor and schema are for. Is there any reading on this topic? https://grafana.com/docs/mimir/latest/send/native-histograms/#bucket-boundary-calculation
I cannot explain what Factor means without also explaining how it relates to Schema, so here is the relationship:
The Schema is a single integer that sets the resolution (granularity) of the histogram: it determines where the bucket boundaries fall, and therefore how finely your data is grouped into buckets. Think of it as a resolution setting for the whole histogram rather than something applied to individual values. It takes one of a fixed set of values (from -4 to 8). A higher schema means narrower buckets and more of them; a lower schema means wider buckets and fewer of them.
The factor is the ratio between consecutive bucket boundaries, so it determines the spacing (width) of the histogram buckets.
You give the factor as a target (e.g., 1.1), and the schema is then chosen from the list [-4, -3, -2, -1, 0, 1, 2, 3, 4, 5, 6, 7, 8] so that the actual growth between buckets, 2^(2^-schema), is as large as possible without exceeding that factor. For 1.1 this gives schema 3, i.e. an actual factor of 2^(1/8), roughly 1.09.
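To make the factor-to-schema step concrete, here is a minimal Python sketch. It is modeled on how I understand the Prometheus client libraries to pick the schema; the names pick_schema and growth_factor are my own and not part of any library, so treat it as an illustration rather than the reference implementation.

```python
import math

SCHEMA_MIN, SCHEMA_MAX = -4, 8  # the allowed schema values


def growth_factor(schema: int) -> float:
    """Actual ratio between consecutive bucket boundaries for a schema."""
    return 2.0 ** (2.0 ** -schema)


def pick_schema(factor: float) -> int:
    """Hypothetical helper: coarsest schema whose growth factor is <= factor."""
    if factor <= 1:
        raise ValueError("factor must be greater than 1")
    # 2**(2**-n) <= factor  is equivalent to  n >= -log2(log2(factor)),
    # so take the smallest integer n that satisfies it ...
    schema = math.ceil(-math.log2(math.log2(factor)))
    # ... and clamp it into the supported range.
    return max(SCHEMA_MIN, min(SCHEMA_MAX, schema))


schema = pick_schema(1.1)
print(schema, growth_factor(schema))
```

With a factor of 1.1 this picks schema 3, so the real growth between buckets is slightly below the 1.1 you asked for.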
This gives you a flexible histogram whose bucket sizes grow exponentially as values move away from zero (on both the positive and the negative side), which is useful for data spanning multiple orders of magnitude.
Furthermore, the factor is applied to the bucket boundaries, not to your data: with a factor of 1.1, each successive bucket's boundary is about 1.1 times larger than the previous one. This exponential growth lets the histogram represent a wide range of values without requiring a large number of buckets. It is basically a way of making sure you don't have more buckets than required.
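A short Python sketch of that growth, assuming the usual native-histogram layout where the positive bucket with index i has the upper boundary (2^(2^-schema))^i. The helper names upper_bound and buckets_needed are made up for this example, and the bucket count is only an estimate.

```python
import math


def upper_bound(index: int, schema: int) -> float:
    """Upper boundary of the positive bucket with the given index."""
    return 2.0 ** (index * 2.0 ** -schema)


def buckets_needed(low: float, high: float, schema: int) -> int:
    """Rough number of buckets needed to span the range from low to high."""
    return math.ceil(math.log2(high / low) * 2 ** schema)


schema = 3  # the schema that a factor of 1.1 maps to
for i in range(1, 9):
    # Each boundary is about 1.09 times the previous one.
    print(f"bucket {i}: upper bound {upper_bound(i, schema):.4f}")

# Estimate how many buckets cover observations from 1 ms to 100 s.
print(buckets_needed(0.001, 100, schema))
```

At schema 3 the eighth boundary lands exactly on 2, which is why people describe it as 8 buckets per power of two.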
In the simplest form: the Factor is the scaling between consecutive bucket boundaries, an exponential growth rate given by 2^(2^-schema). The higher the factor (that is, the lower the schema), the faster the growth between bucket sizes.
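If it helps to see all the possible values side by side, this small Python sketch prints the growth factor for every schema from -4 to 8. The growth_factor helper is just my own naming for the 2^(2^-schema) formula.

```python
def growth_factor(schema: int) -> float:
    """Ratio between consecutive bucket boundaries for a given schema."""
    return 2.0 ** (2.0 ** -schema)


# One entry per allowed schema, from coarsest (-4) to finest (8).
table = {schema: growth_factor(schema) for schema in range(-4, 9)}

for schema, base in table.items():
    print(f"schema {schema:>2}: each boundary is x{base:.4f} the previous one")
```

The factor shrinks toward 1 as the schema goes up, so a low schema means few, fast-growing buckets and a high schema means many, slowly growing ones.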