I have avoided RIGHT OUTER JOIN
, since the same can be achieved using LEFT OUTER JOIN
if you reorder the tables.
However, recently I have been working with the need to have large numbers of joins, and I often encounter a pattern where a series of INNER JOIN
s are LEFT JOIN
ed to a sub select which itself contains many INNER JOIN
s:
SELECT *
FROM Tab_1 INNER JOIN Tab_2 INNER JOIN Tab_3...
LEFT JOIN (SELECT *
FROM Tab_4 INNER JOIN Tab_5 INNER JOIN Tab_6....
)...
The script is hard to read. I often encounter sub sub selects. Some are correlated sub-selects and performance across the board is not good (probably not only because of the way the scripts are written).
I could of tidy it up in several ways, such as using common table expressions, views, staging tables etc, but a single RIGHT JOIN
could remove the need for the sub selects. In many cases, doing so would improve performance.
In the example below, is there a way to replicate the result given by the first two SELECT statements, but using only INNER
and LEFT
joins?
DECLARE @A TABLE (Id INT)
DECLARE @B TABLE (Id_A INT, Id_C INT)
DECLARE @C TABLE (Id INT)
INSERT @A VALUES (1),(2)
INSERT @B VALUES (1,10),(2,20),(1,20)
INSERT @C VALUES (10),(30)
-- Although we want to see all the rows in A, we only want to see rows in C that have a match in B, which must itself match A
SELECT A.Id, T.Id
FROM
@A AS A
LEFT JOIN ( SELECT *
FROM @B AS B
INNER JOIN @C AS C ON B.Id_C = C.Id) AS T ON A.Id = T.Id_A;
-- NB Right join as although B and C MUST match, we only want to see them if they also have a row in A - otherwise null.
SELECT A.Id, C.Id
FROM
@B AS B
INNER JOIN @C AS C ON B.Id_C = C.Id
RIGHT JOIN @A AS A ON B.Id_A = A.Id;
Would you rather see the long-winded sub-selects, or a RIGHT JOIN
, assuming decent comments in each case?
All the articles I have ever read have said pretty much what I think about RIGHT JOINS, that they are unecessary and confusing. Is this case strong enough to break the cultural aversion?
I have found an answer to this question in the old scripts that I was going through - I came across this syntax which performs the same function as the RIGHT JOIN
example, using LEFT JOIN
s (or at least I think it does - it certainly gives the correct results in the example):
DECLARE @A TABLE (Id INT)
DECLARE @B TABLE (Id_A INT, Id_C INT)
DECLARE @C TABLE (Id INT)
INSERT @A VALUES (1),(2)
INSERT @B VALUES (1,10),(2,20),(1,20)
INSERT @C VALUES (10),(30)
SELECT
A.Id, C.Id
FROM
@A AS A
LEFT JOIN @B AS B
INNER JOIN @C AS C
ON C.Id = B.Id_C
ON B.Id_A = A.Id
I don't know if there is a name for this pattern, which I have not seen before in other places of work, but it seems to work like a "nested" join, allowing the LEFT JOIN to preserve rows from the later INNER JOIN.
EDIT: I have done some more research and apparently this is an ANSI SQL syntax for nesting joins, but... it does not seem to be very popular!