[SOLVED] Range minimum query, dynamic array, interval tree, treap

Range minimum query, dynamic array, interval tree, treap

I need an algorithm with some data structure in Python that at every step when two new elements e1, e2 are given:

finds the insertion positions (conserving the order) of the first and the second given elements.
finds the maximum value of the elements in the interval between the two insertion positions.
inserts at the previously found insertion position of the second given element, the second given element paired with the maximum value found in the interval plus a constant. Unless the second given element was already there, in which case we only need to update its value if the new value is greater.

And this step must be done in no more than logarithmic time, because when this step is repeated N times the overall worst-case time complexity cannot be far from O(NlogN).

-- For example: my_list = [(2,1), (4,3), (5,7), (9,1)]

As we see, the element 2 is paired with its assigned value 1, the element 4 is paired with value 3, 5 is paired with value 7, and 9 with value 1. And my_list is ordered by the first elements of the pairs.

Now, two elements are given, e1 = 3, e2 = 6.

The insertion position in my_list of (e1, ) == (3, ) is index 1, and the insertion position of (6, ) is index 3.

The maximum value found in the elements of my_list between indices 1 and 3 is value 7, because the maximum value of (4,3), (5,7) is 7.

Imagining that the constant to add is 1 we have: maximum value found + constant == 7 + 1 == 8. And we had e2 == 6, so the pair to insert is (6, 8) at the index 3.

And at the end of this step my_list must be: [(2,1), (4,3), (5,7), (6,8), (9,1)]

-- This question linked here is very similar but differs with my question on the index where the element is inserted. In that question the element is added at the end (appended), in my case, the insertion must be done in a way that preserves the order of the elements such that the start and end of the next arbitrary interval can be found in logarithmic time. This is why I think that, besides using a range minimum query, I will also need to use some advanced data structure like an interval tree or a treap.

Solution

For this type of job, I usually use an augmented B+ tree. See here for what a B+ tree is: https://en.wikipedia.org/wiki/B%2B_tree

Starting with a B+ tree, I would augment all of the internal nodes so that along with each pointer to a child node, it stores the maximum value associated with all keys in the subtree rooted at that child node.

That extra information makes it easy to calculate the maximum value for any interval in O(log N), and it's easy to maintain as you insert, delete, and modify items in the tree without changing the O(log N) complexity of those operations.

For an in-memory B+ tree, a maximum of 8 or so keys per node is usually performant.