Given a set of intervals: {1-4, 6-7, 10-12} add a new interval: (9,11) so that the final solution is 'merged': Output: {1-4, 6-7, 9-12}. The merger can happen on both sides (low as well as high range).
I saw this question was answered at multiple places, someone even suggested using Interval Tress, but did not explain how exactly they would use it. The only solution I know of is to arrange the intervals in ascending order of their start time and iterating over them and trying to merge them appropriately.
If someone can help me understand how we can use interval trees in this use case, that will be great!
[I have been following interval trees in CLRS book, but they do not talk about merging, all they talk about is insertion and search.]
(I'm assuming that this means that intervals can never overlap, since otherwise they'd be merged.)
One way to do this would be to store a balanced binary search tree with one node per endpoint of a range. Each node would then be marked as either an "open" node marking the start of an interval or a "close" node marking the end of an interval.
When inserting a new range, one of two cases will occur regarding the start point of the range:
To determine which case you're in, you can do a predecessor search in the tree for the range's start point. If you get NULL or a close node, you need to insert a new open node representing the start point of the range. If you get an open node, you will just keep extending that interval.
From there, you need to determine how far the range extends. To do this, continuously compute the successor of the initial node you inserted until one of the following occurs:
You have looked at all nodes in the tree. In that case, you need to insert a close node marking the end of this interval.
You see a close node after the end of the range. In that case, you're in the middle of an existing range when the new range ends, so you don't need to do anything more. You're done.
You see a close or open node before the end of the range. In that case, you need to remove that node from the tree, since the old range is subsumed by the new one.
You see an open node after the end of the range. In that case, insert a new close node into the tree, since you need to terminate the current range before seeing the start of this new one.
Implemented naively, the runtime of this algorithm is O(log n + k log n), where n is the number of intervals and k is the number of intervals removed during this process (since you have to do n deletes). However, you can speed this up to O(log n) by using the following trick. Since the deletion process always deletes nodes in a sequence, you can use a successor search for the endpoint to determine the end of the range to remove. Then, you can splice the subrange to remove out of the tree by doing two tree split operations and one tree join operation. On a suitable balanced tree (red-black or splay, for example), this can be done in O(log n) total time, which is much faster if a lot of ranges are going to get subsumed.
Hope this helps!