I have a query with a subquery that runs without problems on MySQL. I wanted to use the same query on AWS Redshift, but I am getting this error:
[0A000] ERROR: This type of correlated subquery pattern is not supported yet
This is the query:
SELECT
COALESCE(r.id, r2.id) AS region_id
FROM
search s
LEFT JOIN
region r ON s.region_id = r.id
LEFT JOIN
region r2 ON r2.id = (SELECT id
FROM region AS r
WHERE (acos(sin(radians(s.latitude))
* sin(radians(r.latitude))
+ cos(radians(s.latitude))
* cos(radians(r.latitude))
* cos(radians(r.longitude) - radians(s.longitude))
) * 3959 < :dis
)
AND type IN (1)
ORDER BY
(
acos
(
sin(radians(s.latitude))
* sin(radians(r.latitude))
+ cos(radians(s.latitude))
* cos(radians(r.latitude))
* cos(radians(r.longitude) - radians(s.longitude))
)
* 3959
) ASC LIMIT 1)
WHERE
s.user IS NOT NULL
ORDER BY
s.date_created DESC;
So far, what I have found is that this part of the code is the problem:
( acos
(
sin(radians(s.latitude))
* sin(radians(r.latitude))
+ cos(radians(s.latitude))
* cos(radians(r.latitude))
* cos(radians(r.longitude) - radians(s.longitude))
) * 3959 < :dis
)
AND type IN (1)
ORDER BY
(
acos
(
sin(radians(s.latitude))
* sin(radians(r.latitude))
+ cos(radians(s.latitude))
* cos(radians(r.latitude))
* cos(radians(r.longitude) - radians(s.longitude))
)
* 3959
)
But I do not know how to do it without a subquery.
The problem is that the subquery in the ON clause is correlated with s, so it has to be re-evaluated for every candidate row of the join. On a clustered database this is prohibitively expensive. So you need to flatten it out into an additional derived table that carries everything needed to join each search row to its nearest region.
I can take a stab at this, but I don't know your data and I'm just guessing at what is important in the query.
SELECT
COALESCE(r.id, c.region_id) AS region_id
FROM
search s
LEFT JOIN
region r ON s.region_id = r.id
LEFT JOIN ( SELECT search_id, region_id
FROM ( SELECT search_id, region_id,
-- rank the candidate regions of each search row, nearest first
ROW_NUMBER() OVER (PARTITION BY search_id ORDER BY calc ASC) AS rn
FROM ( SELECT t.id AS search_id, x.id AS region_id,
acos(sin(radians(t.latitude))
* sin(radians(x.latitude))
+ cos(radians(t.latitude))
* cos(radians(x.latitude))
* cos(radians(x.longitude) - radians(t.longitude))
) * 3959 AS calc
FROM region AS x
CROSS JOIN search t
WHERE x.type IN (1)
AND t.user IS NOT NULL) d
-- calc is filtered one level up, where the alias is visible
WHERE calc < :dis) ranked
WHERE rn = 1) c
ON c.search_id = s.id
WHERE
s.user IS NOT NULL
ORDER BY
s.date_created DESC;
I can't test this, so hopefully this change gives you a place to start from.
Note the CROSS JOIN. This is a slightly less expensive replacement for the correlated subquery. Either way you are calculating a distance(?) for every row combination between region and search and then finding the smallest value.
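On the "distance(?)": the expression looks like the spherical law of cosines for great-circle distance. Written out, assuming the latitude and longitude columns are stored in degrees and that 3959 is the Earth's mean radius in miles (which would put :dis in miles too), it would be roughly the following. The symbols are my own notation, not anything from the question.

```latex
d(s, r) = 3959 \cdot \arccos\bigl(
    \sin\varphi_s \sin\varphi_r
  + \cos\varphi_s \cos\varphi_r \cos(\lambda_r - \lambda_s)
\bigr)
```

Here \varphi is a latitude and \lambda a longitude, both converted to radians, with the subscripts s and r standing for the search row and the region row.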