When creating a cartesian product between two tables, is there any difference between CROSS APPLY and OUTER APPLY?
This may seem like a silly question given that without a relationship expressed between the tables, the right-hand table can't fail to satisfy the relation, but I'm respectful of what I don't know.
When I look at the execution plans with a simple test setup, they're identical [two index seeks feeding into Nested Loops (Inner Join)], but simple test setups can be deceptive.
Here's an example of what I mean (SQL Fiddle). The setup:
CREATE TABLE dbo.First (
Id INT IDENTITY(1, 1) PRIMARY KEY,
Name NVARCHAR(100)
);
GO
DECLARE @n INT = 1;
WHILE @n < 10000
BEGIN
INSERT INTO dbo.First (Name) VALUES ('First' + CONVERT(NVARCHAR(100), @n));
SET @n = @n + 1;
END
GO
CREATE INDEX IX__First__Name ON dbo.First(Name);
GO
CREATE TABLE dbo.Second (
Id INT IDENTITY(1, 1) PRIMARY KEY,
Name NVARCHAR(100)
);
GO
DECLARE @n INT = 1;
WHILE @n < 10000
BEGIN
INSERT INTO dbo.Second (Name) VALUES ('Second' + CONVERT(NVARCHAR(100), @n));
SET @n = @n + 1;
END
GO
CREATE INDEX IX__Second__Name ON dbo.Second(Name);
GO
Using CROSS APPLY:
SELECT First.Id AS FirstId, Second.Id AS SecondId
FROM First
CROSS APPLY Second
WHERE First.Name IN ('First253', 'First3304')
AND Second.Name IN ('Second6543', 'Second517');
Using OUTER APPLY:
SELECT First.Id AS FirstId, Second.Id AS SecondId
FROM First
OUTER APPLY Second -- <== Only change is here
WHERE First.Name IN ('First253', 'First3304')
AND Second.Name IN ('Second6543', 'Second517');
...both of which give me the expected four rows.
Plus several variations where either or both IN clauses return no matches:
-- No match in First
SELECT First.Id AS FirstId, Second.Id AS SecondId
FROM First
CROSS APPLY Second
WHERE First.Name IN ('no match')
AND Second.Name IN ('Second6543', 'Second517');
SELECT First.Id AS FirstId, Second.Id AS SecondId
FROM First
OUTER APPLY Second
WHERE First.Name IN ('no match')
AND Second.Name IN ('Second6543', 'Second517');
-- No match in Second
SELECT First.Id AS FirstId, Second.Id AS SecondId
FROM First
CROSS APPLY Second
WHERE First.Name IN ('First253', 'First3304')
AND Second.Name IN ('no match');
SELECT First.Id AS FirstId, Second.Id AS SecondId
FROM First
OUTER APPLY Second
WHERE First.Name IN ('First253', 'First3304')
AND Second.Name IN ('no match');
-- No match in either
SELECT First.Id AS FirstId, Second.Id AS SecondId
FROM First
CROSS APPLY Second
WHERE First.Name IN ('no match')
AND Second.Name IN ('no match');
SELECT First.Id AS FirstId, Second.Id AS SecondId
FROM First
OUTER APPLY Second
WHERE First.Name IN ('no match')
AND Second.Name IN ('no match');
...all of which give me the expected zero rows.
The difference comes into play when the applied table or table-valued function returns no rows:
SELECT First.Id AS FirstId, Second.Id AS SecondId
FROM First
OUTER APPLY (SELECT * FROM Second WHERE Second.Id = -1) Second
WHERE First.Name IN ('First253', 'First3304');
2 rows returned
SELECT First.Id AS FirstId, Second.Id AS SecondId
FROM First
CROSS APPLY (SELECT * FROM Second WHERE Second.Id = -1) Second
WHERE First.Name IN ('First253', 'First3304');
0 rows returned
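One way to read those two results: when the right-hand side does not reference the left-hand table, CROSS APPLY behaves like a CROSS JOIN and OUTER APPLY behaves like a LEFT JOIN with an always-true condition. The sketch below is only an illustration of that reading, assuming the dbo.First and dbo.Second tables from the question's setup; the ON 1 = 1 condition is my own stand-in for "no relationship between the tables".

```sql
-- Assumes dbo.First and dbo.Second from the setup in the question.

-- Counterpart of the CROSS APPLY query: the derived table is empty,
-- so the cartesian product is empty and no rows come back.
SELECT First.Id AS FirstId, Second.Id AS SecondId
FROM First
CROSS JOIN (SELECT * FROM Second WHERE Second.Id = -1) Second
WHERE First.Name IN ('First253', 'First3304');

-- Counterpart of the OUTER APPLY query: each matching First row is
-- preserved and SecondId is NULL because nothing matched on the right.
SELECT First.Id AS FirstId, Second.Id AS SecondId
FROM First
LEFT JOIN (SELECT * FROM Second WHERE Second.Id = -1) Second ON 1 = 1
WHERE First.Name IN ('First253', 'First3304');
```

The first query should return no rows and the second should return one row per matching First row, mirroring the row counts shown above.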
In OP's own words:
Not the way you're doing it, because conceptually you're filtering with WHERE after the APPLY (although the plans show the engine optimizing by doing it first); but if you explicitly filter first and then APPLY like this:
SELECT First.Id AS FirstId, FilteredSecond.Id AS SecondId
FROM First
CROSS APPLY (SELECT Id FROM Second WHERE Name IN ('xxx')) FilteredSecond
WHERE First.Name IN ('First253', 'First3304');
you'd see the difference because you'd get rows with NULLs with the OUTER but no rows with the CROSS.
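For comparison, here is a sketch of the OUTER counterpart of that query, again assuming the dbo.First and dbo.Second setup from the question. The second statement adds an IS NOT NULL predicate of my own (not in the original post) to show why the question's queries behaved identically: any predicate on the applied side in the outer WHERE throws away the NULL-extended rows.

```sql
-- Assumes dbo.First and dbo.Second from the setup in the question.
-- 'xxx' matches no Name in Second, so FilteredSecond is empty.

-- Filter inside the APPLY: the First rows survive with a NULL SecondId.
SELECT First.Id AS FirstId, FilteredSecond.Id AS SecondId
FROM First
OUTER APPLY (SELECT Id FROM Second WHERE Name IN ('xxx')) FilteredSecond
WHERE First.Name IN ('First253', 'First3304');

-- Illustrative extra predicate: filtering on the applied side in the
-- outer WHERE removes the NULL-extended rows, so this returns nothing,
-- exactly as the CROSS APPLY version does.
SELECT First.Id AS FirstId, FilteredSecond.Id AS SecondId
FROM First
OUTER APPLY (SELECT Id FROM Second WHERE Name IN ('xxx')) FilteredSecond
WHERE First.Name IN ('First253', 'First3304')
  AND FilteredSecond.Id IS NOT NULL;
```

So the place where the filter sits, inside the applied expression or in the outer WHERE, is what decides whether OUTER and CROSS can differ.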