javascript arrays algorithm array-algorithms

Merge overlapping intervals behavior

I'm attempting to solve the merge overlapping intervals problem and have the working solution below. However, there's a piece of logic that I'm having trouble understanding: the part where currentInterval[1] gets updated with the max of currentInterval[1] and nextInterval[1]. This piece of code also updates the last array in the mergedIntervals array, and I don't fully understand how this happens, as I don't see how currentInterval is linked with the last array in mergedIntervals. Could someone please explain how mergedIntervals is being updated when we set currentInterval[1]?

    const array = [
      [1, 2],
      [4, 7],
      [9, 10],
      [3, 5],
      [6, 8],
    ];

    function mergeOverlappingIntervals(array) {
      let sortedIntervals = array.sort(function (a, b) {
        return a[0] - b[0];
      });
    
      let mergedIntervals = [];
      let currentInterval = sortedIntervals[0];
      mergedIntervals.push(currentInterval);
    
      for (let nextInterval of sortedIntervals) {
        if (currentInterval[1] >= nextInterval[0]) {
          currentInterval[1] = Math.max(
            currentInterval[1],
            nextInterval[1],
          );
        } else {
          currentInterval = nextInterval;
          mergedIntervals.push(nextInterval);
        }
      }
    
      return mergedIntervals;
    }

    const result = mergeOverlappingIntervals(array);
    console.log('result', result);

Solution

One insight that may help you understand this algorithm is the following:

Even when an interval has already been pushed on the result array, it can still be modified.

And this is what happens. currentInterval is always an interval that has already been pushed onto the result array. The first such push happens before the loop even starts.

But later on, currentInterval may be mutated (when the if condition is true). More precisely, it may get extended. Such an extension is visible in the result array: even though mergedIntervals is not mentioned anywhere in the if block, it is indirectly mutated, because currentInterval refers to one of its members.
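The same effect can be reproduced outside the algorithm. This is only a minimal sketch; the names result and interval are made up for the illustration and do not appear in the question's code:

```javascript
// Hypothetical names, used only for this illustration.
const result = [];
const interval = [1, 5];
result.push(interval);   // push stores a reference to the same array, not a copy

interval[1] = 7;         // mutate the array through the variable...

console.log(result[0]);              // ...and the result array shows the change: [1, 7]
console.log(result[0] === interval); // true: one array, reachable in two ways
```

So there are never two separate arrays to keep in sync; the variable and the slot in the outer array both point at the same one.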

Let's illustrate this with a simple run of this algorithm.

Input: [[1, 5], [3, 7]]

Just before the loop starts we have this situation:

mergedIntervals
┌─────────┬───────────────────────┐
│ index:  │         0             │
├─────────┼───────────────────────┤
│ content:│ currentInterval       │
│         │ ┌─────────┬───┬───┐   │
│         │ │ index:  │ 0 │ 1 │   │
│         │ ├─────────┼───┼───┤   │
│         │ │ content:│ 1 │ 5 │   │
│         │ └─────────┴───┴───┘   │
└─────────┴───────────────────────┘

This visualisation shows the outer array with one slot (index 0), whose content is an inner array with two slots (indices 0 and 1). The key here is that the variable currentInterval references an array that sits inside mergedIntervals.

The first iteration of the loop is redundant: nextInterval is the first of the input intervals, which is the very same array as currentInterval. The if condition is true, but the Math.max expression evaluates to the value that currentInterval[1] already has: 5. So this iteration changes nothing, but it doesn't do any harm either.
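If you want to avoid that redundant pass, one possible variant is to loop over the intervals after the first one. This is just a sketch, not a required change: the function name mergeSkippingFirst is hypothetical, and the guard for an empty input is an addition that the original code does not have.

```javascript
// Hypothetical variant of the question's function that skips the first interval in the loop.
function mergeSkippingFirst(intervals) {
  // Added guard (not in the original): nothing to merge in an empty input.
  if (intervals.length === 0) return [];

  // Like the original, this sorts the input array in place by start value.
  const sorted = intervals.sort((a, b) => a[0] - b[0]);

  let current = sorted[0];
  const merged = [current];

  // slice(1) makes a shallow copy of the outer array only;
  // the inner arrays are still the same objects.
  for (const next of sorted.slice(1)) {
    if (current[1] >= next[0]) {
      // Overlap: extend the interval that already sits inside merged.
      current[1] = Math.max(current[1], next[1]);
    } else {
      // No overlap: start tracking a new interval.
      current = next;
      merged.push(next);
    }
  }
  return merged;
}

console.log(mergeSkippingFirst([[1, 5], [3, 7]])); // [[1, 7]]
```

The merging logic is unchanged; only the starting point of the loop differs.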

The second iteration is the interesting one. Here nextInterval is the second interval in the input: [3, 7]. The if condition is true (because 5 >= 3), and so currentInterval[1] gets the value of Math.max(5, 7), which is nextInterval[1], i.e. 7. This effectively extends the current interval, and we get this:

mergedIntervals
┌─────────┬───────────────────────┐
│ index:  │         0             │
├─────────┼───────────────────────┤
│ content:│ currentInterval       │
│         │ ┌─────────┬───┬───┐   │
│         │ │ index:  │ 0 │ 1 │   │
│         │ ├─────────┼───┼───┤   │
│         │ │ content:│ 1 │ 7 │   │
│         │ └─────────┴───┴───┘   │
└─────────┴───────────────────────┘

Note how this affects the result array!

Had the second interval been smaller, so that it fell completely inside currentInterval (e.g. [3, 4]), nothing would change, because the Math.max expression would then select currentInterval[1]. That is indeed the right thing to do.

In this simple example the loop ends here, and the output is [[1, 7]], which is what we would expect.
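The walk-through can be checked by running the function from the question on this input, and on the contained case [3, 4]. The function body is the one from the question; the variables input, first and output are only introduced here for the check:

```javascript
// The function from the question, with its logic unchanged.
function mergeOverlappingIntervals(array) {
  let sortedIntervals = array.sort(function (a, b) {
    return a[0] - b[0];
  });

  let mergedIntervals = [];
  let currentInterval = sortedIntervals[0];
  mergedIntervals.push(currentInterval);

  for (let nextInterval of sortedIntervals) {
    if (currentInterval[1] >= nextInterval[0]) {
      currentInterval[1] = Math.max(currentInterval[1], nextInterval[1]);
    } else {
      currentInterval = nextInterval;
      mergedIntervals.push(nextInterval);
    }
  }

  return mergedIntervals;
}

const input = [[1, 5], [3, 7]];
const first = input[0];            // keep a reference to the first inner array
const output = mergeOverlappingIntervals(input);

console.log(output);               // [[1, 7]]
console.log(output[0] === first);  // true: the merged interval is the input's own array, extended

console.log(mergeOverlappingIntervals([[1, 5], [3, 4]])); // [[1, 5]]
```

The second log also shows a side effect worth knowing about: since the merged interval is one of the input's inner arrays, the input itself ends up modified.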

I hope this clarifies the algorithm.