I am trying to understand how to go from n log^2 n time to n log n time for the closest pair algorithm. I get the below part (from http://www.cs.mcgill.ca/~cs251/ClosestPair/ClosestPairDQ.html)
Step 4 is a sort that takes O(n log n) time, which dominates all other steps; since it runs at every level of the recursion, it is the step that must be reduced to O(n) for the overall algorithm to achieve O(n log n) time. This is the part I am having a hard time understanding. The author proposes:
Step 1: Divide the set into..., and recursively compute the distance in each part, returning the points in each set in sorted order by y-coordinate. Step 4: Merge the two sorted lists into one sorted list in O(n) time.
You still have to sort the points by the y-coordinate in the recursive step, which takes O(n log n) time. How can we avoid this? The merging is O(n), but we still have to sort somewhere.
The reason that O(n log n) is a problem is that we do it over and over again: if we partition the set into two subsets, and partition each of those into two subsets, then the naive approach involves seven sorts (one over the whole set, two over halves of it, four over quarters of it). Carried through all O(log n) levels of the recursion, that sorting work adds up to O(n log n) per level, which is where the overall O(n log^2 n) comes from.
So the proposal fixes this by reusing the results of previous sorts: the smallest partitions (single points) are trivially sorted at no cost, and each larger partition needs only a single O(n) merge pass to combine the already-sorted results of its two halves. That is O(n) work per level of the recursion across O(log n) levels, i.e. O(n log n) total sorting work over the whole algorithm, the same as a single merge sort, which is fine.
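Here is a minimal Python sketch of how the merge fits into the recursion. The function name and the strip check are my own illustration, not the linked author's code; the key point is that each call returns its points sorted by y, so the parent only does an O(n) merge instead of a fresh sort.

```python
import math

def closest_pair(pts_by_x):
    """pts_by_x: list of (x, y) tuples sorted by x-coordinate.
    Returns (min distance, the same points sorted by y-coordinate)."""
    n = len(pts_by_x)
    if n <= 1:
        return math.inf, pts_by_x  # base case: a single point is trivially "sorted"
    mid = n // 2
    mid_x = pts_by_x[mid][0]
    d_left, left_by_y = closest_pair(pts_by_x[:mid])
    d_right, right_by_y = closest_pair(pts_by_x[mid:])
    d = min(d_left, d_right)

    # O(n) merge of the two y-sorted lists -- this replaces the O(n log n) sort.
    merged, i, j = [], 0, 0
    while i < len(left_by_y) and j < len(right_by_y):
        if left_by_y[i][1] <= right_by_y[j][1]:
            merged.append(left_by_y[i]); i += 1
        else:
            merged.append(right_by_y[j]); j += 1
    merged.extend(left_by_y[i:])
    merged.extend(right_by_y[j:])

    # Points within d of the dividing line, already in y order thanks to the merge.
    strip = [p for p in merged if abs(p[0] - mid_x) < d]
    for a in range(len(strip)):
        # a packing argument bounds how many strip points can be within d in y,
        # so this inner loop is O(1) per point
        for b in range(a + 1, min(a + 8, len(strip))):
            d = min(d, math.dist(strip[a], strip[b]))
    return d, merged
```

Called as `closest_pair(sorted(points))`, this pays for one initial sort by x up front, after which every level of the recursion does only linear work on the y-order.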