arraysalgorithmdata-structuresbinary-treermq

What is the logic behind calculating minimum queries in static arrays?


I was reading about static array queries and this is what I found:

Minimum Queries: There is an O(nlogn) time preprocessing method after which we can answer any minimum query in O(1) time.

The idea is to precalculate all values of min(a, b) where b - a + 1 (the length of the range) is a power of two. The number of precalculated values is O(nlogn), because there are O(logn) range lengths that are powers of two.

The values can be calculated efficiently using the recursive formula:
min(a,b) = min(min(a, a + w - 1), min(a + w, b))
where b-a+1 is a power of two and w = (b - a + 1) / 2

What is meant by the part quoted above? Why do we calculate the minimum only for certain lengths?
What is the idea and the intuition behind it? What does the logic do?

It's kind of a hunch that it must be related to something about a binary tree because we're considering just the powers of two for the lengths.


Solution

  • This structure is referred to an RMQ, a Range Minimum Query. It achieves its O(1) queries by exploiting the associativity and commutativity of the min operation (that is, min(x,y) = min(y,x) and min(x,y,z) = min(x,min(y,z)). The other property that min has is that min(x,x) = x, and more importantly, min(x,y,z) = min(min(x,y),min(y,z))

    If you have all the mins for every subarray of length of every power of 2 (hence the n log n memory), you can compute the range min(l-r) by taking the min of the largest power of 2, starting at l, that doesn't overshoot r, with the min of the largest power of 2 ending at r that doesn't undershoot l. The idea here is as follows:

    arr=[a,b,c,d,e,f,g,h] We calculate our RMQ to have mins as follows:

    of length 1: [min(a), min(b), min(c), etc]

    of length 2: [min(a,b), min(b,c), min(c,d), min(d,e), etc]

    of length 4: [min(a,b,c,d}, min(b,c,d,e), min(c,d,e,f), etc]

    To take the min from 1 to 6, we want the range min of length 4 starting at 1 (since 8 would go past our right index) and take the min of that with the range min of length 4 ending at 6. So we take these queries from the array of length 4, and take the min of min(of length 4[1], of length 4[2]) and that's our answer.