First off, I was reading about the sweep-line algorithm to find closest pair of points in O(N lgN) time over at topcoder. I mostly understood the algorithm, however when I look at the implementation provided here (copied and made more readable below), I notice some striking differences.
#define x first
#define y second
typedef pair<long long, long long> pll;
...
set <pll> boundingBox;
boundingBox.insert(points[0]); //Points have been already sorted by x-coordinate
for (int i = 1; i < numPoints; i++)
{
while ((left < i) && (points[i].x - points[left].x > shortestDistSoFar))
{
boundingBox.erase(points[left]);
left++;
}
for (auto it = boundingBox.lower_bound(pll(points[i].y - shortestDistSoFar, points[i].x - shortestDistSoFar)); it != boundingBox.end() && it->y <= points[i].y + shortestDistSoFar; it++)
{
if (dist(*it, points[i]) < shortestDistSoFar)
{
shortestDistSoFar = dist(*it, points[i]);
best1 = *it;
best2 = points[i];
}
}
boundingBox.insert(points[i]);
}
First, in the implementation snippet above, the std::set that holds the points and represents the bounding rectangle is not sorted by y-coordinate (instead by x-coordinate) which is contrary to what almost every other source says: The set is ordered by y coordinate. (Topcoder)
.
Next, even though the set is not sorted by y-coordinate, when the iterator is used to consider points in the active set ... whose y coordinates are in the range yN − h to yN + h
, it is taken as the lower bound of pll(points[i].y - shortestDistSoFar, points[i].x - shortestDistSoFar)
. Why does the y
come first? I would think that the correct order would be pll(points[i].x, points[i].y - shortestDistSoFar)
but changing it to this breaks the algorithm.
Can someone please help address these seemingly inconsistent things?
In the original code, y coordinate is the first element of a pair. That's why points in a set are ordered correctly.