c++algorithmsetpointsclosest-points

Understanding specific implementation of sweep line closest pair


First off, I was reading about the sweep-line algorithm to find closest pair of points in O(N lgN) time over at topcoder. I mostly understood the algorithm, however when I look at the implementation provided here (copied and made more readable below), I notice some striking differences.

#define x first
#define y second
typedef pair<long long, long long> pll;
   ...
set <pll> boundingBox;
boundingBox.insert(points[0]); //Points have been already sorted by x-coordinate
for (int i = 1; i < numPoints; i++)
{
    while ((left < i) && (points[i].x - points[left].x > shortestDistSoFar))
    {
        boundingBox.erase(points[left]);
        left++;
    }

    for (auto it = boundingBox.lower_bound(pll(points[i].y - shortestDistSoFar, points[i].x - shortestDistSoFar)); it != boundingBox.end() && it->y <= points[i].y + shortestDistSoFar; it++)
    {
        if (dist(*it, points[i]) < shortestDistSoFar)
        {
            shortestDistSoFar = dist(*it, points[i]);
            best1 = *it;
            best2 = points[i];
        }
    }
    boundingBox.insert(points[i]);
}

First, in the implementation snippet above, the std::set that holds the points and represents the bounding rectangle is not sorted by y-coordinate (instead by x-coordinate) which is contrary to what almost every other source says: The set is ordered by y coordinate. (Topcoder).

Next, even though the set is not sorted by y-coordinate, when the iterator is used to consider points in the active set ... whose y coordinates are in the range yN − h to yN + h, it is taken as the lower bound of pll(points[i].y - shortestDistSoFar, points[i].x - shortestDistSoFar). Why does the y come first? I would think that the correct order would be pll(points[i].x, points[i].y - shortestDistSoFar) but changing it to this breaks the algorithm.

Can someone please help address these seemingly inconsistent things?


Solution

  • In the original code, y coordinate is the first element of a pair. That's why points in a set are ordered correctly.