I have a few tables that I join to produce a result data set I need (as shown below).
Query:
select distinct on (e."Col1", e."Col2")
s."SrcId",
s."CreatedTime",
s."ColM",
s."ColN",
m."ColX" as "m_ColX",
m."ColY" as "m_ColY",
n."ColZ1" as "n.ColZ1",
n."ColZ2" as "n.ColZ2",
e."Col1",
e."Col2"
from schema1."Table1" s
left join schema1."Table2" e on s."SrcId" = e.sid and date_trunc('second',s."CreatedTime")=date_trunc('second',e."ReportedTime")
left join schema1."Table3" m on m."Sid"=s."SrcId" and m."IsActive"=TRUE
left join schema1."Table4" n on n."Sid"=s."SrcId" and n."IsActive"=TRUE;
When I execute the query directly with a where clause (shown below), the results are almost instantaneous.
select distinct on (e."Col1", e."Col2")
s."SrcId",
s."CreatedTime",
s."ColM",
s."ColN",
m."ColX" as "m_ColX",
m."ColY" as "m_ColY",
n."ColZ1" as "n.ColZ1",
n."ColZ2" as "n.ColZ2",
e."Col1",
e."Col2"
from schema1."Table1" s
left join schema1."Table2" e on s."SrcId" = e.sid and date_trunc('second',s."CreatedTime")=date_trunc('second',e."ReportedTime")
left join schema1."Table3" m on m."Sid"=s."SrcId" and m."IsActive"=TRUE
left join schema1."Table4" n on n."Sid"=s."SrcId" and n."IsActive"=TRUE
where s."SrcId"=10 and s."CreatedTime" between '2024-10-10T00:00:00.000Z' and '2024-10-10T10:10:10.000Z'
But if I use the same query in a view, the results take way longer. Assume the view is called view1 for simplicity and I invoke it as below:
select * from schema1.view1 where "SrcId"=10 and "CreatedTime" between '2024-10-10T00:00:00.000Z' and '2024-10-10T10:10:10.000Z'
Can someone point out why this is the case and if there is anything I can do to improve/optimize calling the view so it results in a similar performance?
The first thought that comes to mind is the view might be taking time to correlate the where clause fields with the fields in the actual tables to figure out which ones to map them to. Could that be the culprit (or worse could that be skewing my results as well?) If so, what is a more efficient way to pass the where clause?
The queries are not equivalent:
in the query that runs fast for you, the WHERE
condition is executed before the DISTINCT
, and it can probably filter rows efficiently
in the slow query, the WHERE
condition is applied after the DISTINCT
, which prevents such filtering and may give you a different query result
My recommendation is to lay off the DISTINCT
in the view definition. If you really need it, write it into the query that uses the view.