I'm running a case when exists
to try to filter out anamoly data from my table, but I am running into a unsupported correlated query
error.
I have a table of anomalies for different metrics and I want to null those metric values out in my final results if they're in this anomaly table.
I have a table, anomalytable
, which stores an id, the metric type of the anomaly, and the date range which this occurs.
| id | startdate | enddate | metrictype |
| ---- | --------- | ------- | -----------|
| 1 | 1/1/23 | 1/3/23 | metric1 |
| 2 | 1/3/23 | 1/5/23 | metric2 |
I have a results table where I want to null the metrics for if they belong in the above table.
| id | metric1 | metric2 | date |
| ---- | --------- | ------- | -------|
| 1 | 300 | 1.2 | 1/2/23 |
| 2 | 1.1 | 500 | 1/4/23 |
What I'm hoping my final results will look like (as you can see, the 300 and 500 anomaly numbers were nulled out)
| id | metric1 | metric2 | date |
| ---- | --------- | ------- | -------|
| 1 | null | 1.2 | 1/2/23 |
| 2 | 1.1 | null | 1/4/23 |
My query is below
select
case when exists
(select * from anomalytable b where a.id = b.id and a.date between b.startdate and b.enddate and b.metricname = 'metric1')
then null else a.metric1 end,
case when exists
(select * from anomalytable b where a.id = b.id and a.date between b.startdate and b.enddate and b.metricname = 'metric2')
then null else a.metric2 end
from resultstable a
However everytime I run this I get the correlated subquery pattern not supported
error. I've read through the redshift unsupported correlated queries and can't see what rule I'm breaking. Is it possible to rewrite this with joins in a clean fashion?
Use a left join instead. Simpler and likely faster.
select
r.id,
case when a.metricname = 'metric1' then null else r.metric1val end,
case when a.metricname = 'metric2' then null else r.metric2val end,
r.date
from resultstable r
left join anomalytable a on r.id = a.id and r.date between a.startdate and a.enddate