I have the following query that calculates the distance between two coordinates. However, I want to select the users that are within the range of n
km. Say, for example, I want to select all users that are within a distance of 100km. Therefore, I have to use HAVING
clause here. However, I see that the execution is damn too slow. Even with just two records it's returning the results very slowly. I wonder what would happen with a million user records (in future). Therefore, I am seeking an optimization to my current query for efficient and faster running.
SELECT *,
111.1111 *
DEGREES(ACOS(LEAST(1.0, COS(RADIANS(a.latitude))
* COS(RADIANS(b.latitude))
* COS(RADIANS(a.longitude) - RADIANS(b.longitude))
+ SIN(RADIANS(a.latitude))
* SIN(RADIANS(b.latitude))))) AS distance_in_km
FROM users AS a
JOIN users AS b ON a.id <> b.id
WHERE b.id != :user AND a.id = :user
HAVING distance_in_km < :maxdist
LIMIT 30
UPDATE
As suggested by Rick James, I have removed GROUP BY
clause and replaced it with AND a.id = :user
in WHERE
clause. This, as of now, is returning the same result as was with GROUP BY
.
The basic answer is that you cannot make your query more efficient. For your approach, you basically need to calculate the distance between all pairs of users, and that is expensive.
There may be some hacks that you can use. First, you probably don't need reversed pairs, so you can replace a.id <> b.id
with a.id < b.id
. That will eliminate half the work.
You can use a where
clause to pre-filter the rows. For instance, on most of the earth's surface, points that are more than 2 degrees of latitude or longitude apart are more than 100 km apart. This is not true everywhere. But it is probably good enough for you. That allows you to write:
where a.latitude between b.latitude - 2 and b.latitude + 2 and
a.longitude between b.longitude - 2 and b.longitude + 2
That will save much of the trigonometry, if your users are spread far and wide.
However, the real solution is to use the GIS extensions of MySQL. A place to start learning about that is in the documentation.
EDIT:
SELECT *,
111.1111 *
DEGREES(ACOS(LEAST(1.0, COS(RADIANS(a.latitude))
* COS(RADIANS(b.latitude))
* COS(RADIANS(a.longitude) - RADIANS(b.longitude))
+ SIN(RADIANS(a.latitude))
* SIN(RADIANS(b.latitude))))) AS distance_in_km
FROM users a JOIN
users b
ON a.id <> b.id
WHERE a.id = :user AND
a.latitude between b.latitude - 2 and b.latitude + 2 and
a.longitude between b.longitude - 2 and b.longitude + 2
HAVING distance_in_km < 100