"Dissecting SQL Server Execution Plans" by Grant Fritchey is helping me see why certain queries are slow. However, I am stumped as to why this simple rewrite performs so much faster.
This takes 21 seconds. It uses a derived table:
-- 21 secs
SELECT *
FROM Table1 AS o
JOIN (
    SELECT col1
    FROM Table1
    GROUP BY col1
    HAVING COUNT(*) > 1
) AS i ON i.col1 = o.col1
This version is three times faster, yet it simply moves the derived table out into a temp table:
-- 7 secs
SELECT col1
INTO #doubles
FROM Table1
GROUP BY col1
HAVING COUNT(*) > 1;

SELECT *
FROM Table1 AS o
JOIN #doubles AS i ON i.col1 = o.col1;
My question is: why does moving from a derived table to a temp table improve performance so much (not how to make it even faster)? And how can I diagnose this using the (graphical) execution plan?
Looking at the values in the first execution plan, it looks like a statistics problem. The estimated number of rows is 800, while the actual number is 1.2 million. I think you'll find that updating the statistics will change the way the first query's plan is generated.
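As a rough sketch of how to test that theory (table and column names taken from the question; whether a full scan is needed depends on your data), you could refresh the statistics on Table1 and then re-run the first query with the actual plan captured, so you can compare estimated vs. actual row counts on each operator:

```sql
-- Rebuild all statistics on the table with a full scan
UPDATE STATISTICS Table1 WITH FULLSCAN;

-- Capture the actual execution plan for the re-run;
-- in SSMS you can instead press Ctrl+M (Include Actual Execution Plan)
SET STATISTICS XML ON;

SELECT *
FROM Table1 AS o
JOIN (
    SELECT col1
    FROM Table1
    GROUP BY col1
    HAVING COUNT(*) > 1
) AS i ON i.col1 = o.col1;

SET STATISTICS XML OFF;
```

In the captured plan, hover over each operator and compare "Estimated Number of Rows" with "Actual Number of Rows"; a large gap like 800 vs. 1.2 million is the usual sign of stale statistics driving a bad plan choice.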