mysqlgeospatialspatial-query

Optimize ST_Distance_Sphere query


Geographical Distance Calculation Optimization

I am currently working on a geographical data processing task involving the calculation of distances between a specified geographical point and entries in an address table. The objective is to compile a list of addresses where the calculated distance falls within a predefined range.

The address table under consideration contains the following fields:

Street City Zip Code Country Lat Lon
StreetName1 CityName1 12345 Country1 12.345678 -45.67890
StreetName2 CityName2 23456 Country2 23.456789 -56.78901
StreetName3 CityName3 34567 Country3 34.567890 -67.89012

To perform distance calculations, I have implemented the ST_Distance_Sphere function. In addition, I have created a database view based on this address table and executed queries against the view. Here is an illustrative example of the query:

SELECT *
FROM address_view
WHERE ST_Distance_Sphere(point(Lat, Lon), point(:given_lat, :given_lon)) < :specified_range;

However, I am facing concerns regarding the performance of this process. Calculating the distance for all records in the address table is resource-intensive. Therefore, I am seeking recommendations and strategies to optimize the efficiency of this operation. Specifically, I am interested in query optimization techniques or indexing solutions that would enhance the performance while ensuring the accurate retrieval of addresses falling within the specified distance range.

I seek guidance on efficiently using the ST_Distance_Sphere function to calculate distances between a geographical point and addresses within a view. My goal is to retrieve addresses that meet specific distance criteria while maintaining performance standards.

Any insights on query optimization or indexing strategies are appreciated.


Solution

  • Fetching all the rows of the table is the performance burden.

    There are several ways to take advantage of "within the specified distance range". By using a "bounding box", there are multiple solutions using INDEX, SPATIAL, or even PARTITION. See Find Nearest

    If you continue with POINT a slight improvement is to precompute the "point" for each row from the lat/lng and keep such in a separate column.