mysqlinnodbgeospatial

Mysql Select count with geospatial ST_Contains is very slow with multiple rows


I have a mysql query to get all places count from an area. If I query for only one id it's really quick, if I query for two ids or more then it's really slow.

Areas.geometry and Places.location are SPATIAL indexes.

There is only 3 rows ( all have complex geometry. The row 3 is the more complex ) in areas table and 3000 rows in stores. I build up a demo sql file to import if you want to test : geospatial-exemple.sql

Some exemple :

This query is running in 260ms :

    select  a.name, 
            (
            SELECT  count(*)
                FROM  places p
                WHERE  ST_Contains(a.geometry,p.location)
            ) as places_count
        FROM  areas a
        WHERE  a.id in (1) 

enter image description here


This query is running in 320ms :

    select  a.name, 
            (
            SELECT  count(*)
                FROM  places p
                WHERE  ST_Contains(a.geometry,p.location)
            ) as places_count
        FROM  areas a
        WHERE  a.id in (3) 

enter image description here


This query is running in 50s :

    select  a.name, 
            (
            SELECT  count(*)
                FROM  places p
                WHERE  ST_Contains(a.geometry,p.location)
            ) as places_count
        FROM  areas a
        WHERE  a.id in (1,3) 

enter image description here


I also tried to hardcode the areas.geometry in the query with the more complex MULTIPOLYGON

This query is running in 380ms :

    select  a.name, 
            (
            SELECT  count(*)
                FROM  places p
                WHERE  ST_Contains(ST_GeomFromText("MULTIPOLYGON((...))",
                                    4326,
                                    'axis-order=long-lat'),p.location)
            ) as places_count
        FROM  areas a
        WHERE  a.id in (1,3) 

enter image description here


So clearly it's faster to run multiple queries than only one and wait for a minutes. If somebody know if it's a mysql bug or if there is another way to do that ? Working with Join query give the same results.


Solution

  • According to John Powells answer here, there is an undocumented limitation for spatial indexes:

    For the Contains and Intersects functions to work properly, and for the index to be used, you need to have one of the geometries be a constant. This doesn't appear to be documented, although all the examples you will see with MySQL with Intersects/Contains work this way.

    So running multiple queries with one area each would indeed be faster.

    If you have the permissions to create functions, you can however use a workaround by running your subquery in a function, where areas.geometry will now act as a constant parameter for ST_Contains():

    CREATE FUNCTION fn_getplacescount(_targetarea GEOMETRY) 
    RETURNS INT READS SQL DATA
    RETURN (SELECT COUNT(*) FROM places p WHERE ST_Contains(_targetarea, p.location));
    

    Now

    SELECT a.name, fn_getplacescount(a.geometry) AS places_count 
    FROM areas a WHERE a.id in (1,3);
    

    would be similar to running each area separately, and should have a similar execution time as using two separate queries.