sqlsubqueryamazon-redshiftcorrelated-subquery

ERROR: This type of correlated subquery pattern is not supported yet


I have a subquery that runs without problems on MySQL and I wanted to use the same query on AWS-Redshift, but I am getting this error: [0A000] ERROR: This type of correlated subquery pattern is not supported yet, this is the query:

SELECT 
    COALESCE(r.id, r2.id) AS region_id,
FROM 
    search s
LEFT JOIN 
    region r ON s.region_id = r.id
LEFT JOIN 
    region r2 ON r2.id = (SELECT id
                          FROM region AS r
                          WHERE (acos(sin(radians(s.latitude))
                                      * sin(radians(r.latitude))
                                 + cos(radians(s.latitude))
                                    * cos(radians(r.latitude))
                                    * cos(radians(r.longitude) - radians(s.longitude))
                                ) * 3959 < :dis
                            )
                            AND type IN (1)
                            ORDER BY
                            (
                                acos
                                (
                                    sin(radians(s.latitude))
                                    * sin(radians(r.latitude))
                                    + cos(radians(s.latitude))
                                    * cos(radians(r.latitude))
                                    * cos(radians(r.longitude) - radians(s.longitude))
                                )
                                * 3959
                            ) ASC LIMIT 1)
WHERE 
    s.user IS NOT NULL
ORDER BY
    s.date_created DESC; 

So far, what I have found is that this part of the code is the problem:

                            ( acos
                                (
                                    sin(radians(s.latitude))
                                    * sin(radians(r.latitude))
                                    + cos(radians(s.latitude))
                                    * cos(radians(r.latitude))
                                    * cos(radians(r.longitude) - radians(s.longitude))
                                ) * 3959 < :dis
                            )
                            AND type IN (1)
                            ORDER BY
                            (
                                acos
                                (
                                    sin(radians(s.latitude))
                                    * sin(radians(r.latitude))
                                    + cos(radians(s.latitude))
                                    * cos(radians(r.latitude))
                                    * cos(radians(r.longitude) - radians(s.longitude))
                                )
                                * 3959
                            )

But I do not know how to do it without a subquery.


Solution

  • The problem is that the subquery in the ON clause needs to be re-evaluated for each join possibility. On a clustered db this is prohibitively expensive. So you need to flatten this out to an additional set of JOIN information that has all the needed information to join r and r2.

    I can take a stab at this but I don't know your data and I'm just guessing on what is important in the query.

    SELECT 
        COALESCE(r.id, c.region_id) AS region_id,
    FROM 
        search s
    LEFT JOIN 
        region r ON s.region_id = r.id
    LEFT JOIN ( SELECT search_id, region_id
                FROM ( SELECT s.id as search_id, x.id as region_id, 
                         acos(sin(radians(t.latitude))
                         * sin(radians(x.latitude))
                         + cos(radians(t.latitude))
                         * cos(radians(x.latitude))
                         * cos(radians(x.longitude) - radians(t.longitude))
                         ) * 3959 as calc,
                         ROW_NUMBER() OVER (partition by id1 order by calc asc) as rn
                       FROM region AS x
                       CROSS JOIN search t 
                       WHERE calc < :dis
                         AND type IN (1)
                         AND t.user IS NOT NULL)
                WHERE rn = 1) c 
    ON c.search_id = s.id
    WHERE 
        s.user IS NOT NULL
    ORDER BY
        s.date_created DESC; 
    

    I can't test this so hopefully this change gives you a place to start from.

    Note the CROSS JOIN. This is a slightly less expensive replacement for the correlated subquery. Either way you are calculating a distance(?) for every row combination between region and search and then finding the smallest value.