I am trying to improve the performance of a complex stored procedure. The bottleneck is a giant query with over 30 joins (most of them LEFT joins) and a ridiculous number of criteria in the WHERE clause. In addition, the SELECT list contains calls to multiple functions that each do significant work.
Pseudo Example:
select fn_DoWork1(caseID, orderID) as cln1,
fn_DoWork2(caseID, orderID) as cln2,
... 20 other columns,
BalanceDue
from tbl1
left join tbl2 on ...
...
left join tbl30 on ...
where
(tbl10.ArrivalDate between @foo1 and @foo2)
and ...
and (@prmProcessingID = 0 OR tbl10.ProcessingID = @prmProcessingID)
You get the idea. I broke up the query into smaller portions, depositing data into temp tables along the way, so that each successive query has to work on a smaller subset. But at the end of the day, it still matches thousands of records, so performance didn't improve as much as I hoped.
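In case it's useful, here is a minimal sketch of that staging approach, reusing the table names from the pseudo example above (the join conditions and column lists are invented for illustration):

-- Sketch only: filter the most selective table first so the
-- remaining joins operate on a smaller intermediate set.
select caseID, orderID
into #filtered
from tbl10
where ArrivalDate between @foo1 and @foo2
  and (@prmProcessingID = 0 or ProcessingID = @prmProcessingID);

-- Later queries join against #filtered instead of the full tables.
select f.caseID,
       f.orderID,
       t1.BalanceDue
into #stage2
from #filtered f
inner join tbl1 t1 on t1.caseID = f.caseID
left join tbl2 t2 on t2.orderID = f.orderID;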
My next thought was to return only one page of records at a time (20 records per page), since that is all the user needs to see. So I added OFFSET x ROWS FETCH NEXT 20 ROWS ONLY, and that fixed the performance problem.
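Concretely, the paging clause I added looks like this (@PageNumber is a hypothetical 0-based page index, and note that OFFSET/FETCH requires an ORDER BY):

declare @PageNumber int = 0,   -- hypothetical 0-based page index
        @PageSize   int = 20;

select caseID, orderID, BalanceDue
from #stage2                   -- last temp table from the staged query
order by caseID
offset @PageNumber * @PageSize rows
fetch next @PageSize rows only;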
However, the problem now is that I return 20 rows but have no idea how many rows match the criteria in total. So I can't tell the user how many pages of data there are, which means I can't render the pager in the UI properly.
Is there a better way to approach this problem?
Counting the rows requires evaluating the JOINs and the WHERE clause over the whole dataset, so it is almost as expensive as running the full query and storing the results.
Your choices are to run the full query and cache the results, or adopt a UX that doesn't display the total number of pages.
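Here is a minimal sketch of the first option, using the tables from your pseudo example and a hypothetical 0-based @PageNumber parameter: run the expensive query once into a temp table, count the cached rows for the pager, then page out of the cache cheaply.

declare @PageNumber int = 0;   -- hypothetical 0-based page index

-- Run the expensive query once and cache the results.
select tbl1.caseID,
       tbl1.orderID,
       tbl1.BalanceDue,
       row_number() over (order by tbl1.caseID) as rn
into #cache
from tbl1
left join tbl10 on tbl10.caseID = tbl1.caseID
where tbl10.ArrivalDate between @foo1 and @foo2
  and (@prmProcessingID = 0 or tbl10.ProcessingID = @prmProcessingID);

declare @TotalRows int;
select @TotalRows = count(*) from #cache;   -- total matches, for the pager

-- Paging from the cache is cheap; TotalRows rides along with each page.
select caseID, orderID, BalanceDue,
       @TotalRows as TotalRows
from #cache
where rn between @PageNumber * 20 + 1 and (@PageNumber + 1) * 20
order by rn;

Note that a #temp table only lives for the duration of the procedure or connection, so if each page request is a separate round trip you would need a session-keyed permanent table (or accept re-running the expensive query per page).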