sqlsql-serverjoin

How to join SQL tables with multiple joins with cross-table criteria?


Consider the following Tables

Table A

ID etc.
1 ...
2 ...

Table B

A_ID NAME etc.
1 A ...
1 B ...

Table C

A_ID NAME etc.
1 A ...
1 C ...

Where etc. denotes some unrelated columns that need to be selected as well.

How do I join these tables such that I get the following result? (NAME field is not a dimensional value with a common table)

A_ID B_NAME C_NAME etc. B etc. C
1 A A ... ...
1 B null ... null
1 null C null ...

I tried:

SELECT a.ID as 'A_ID', b.NAME as 'B_NAME', c.NAME as 'C_NAME', etc.
FROM A a
FULL JOIN B b on a.ID = b.A_ID
FULL JOIN C c on a.ID = c.C_ID and (c.NAME = b.NAME or b.NAME is null)

This produces the following table:

A_ID B_NAME C_NAME etc. B etc. C
1 A A ... ...
1 B null ... null
null null C null ...

Note that A_ID is null where B does not have a matching NAME.
Why is this so hard? It seems like I'm missing something obvious, but I can't find any questions joining this way.


Solution

  • You need to FULL JOIN the B with C separately, and then join it back to the main table:

    ;WITH TableA AS (
        SELECT  *
        FROM    (
            VALUES(1, 'Test', 'Zest')
            ) TableA(ID, Data, Data2)
    )
    ,TableB AS (
        SELECT  *
        FROM    (
            VALUES  (1, 'A', 'Data1', 'Data2')
            ,   (1, 'B', 'Data2', 'Data2')
            ) x (A_ID, Name, Data1, Data2)
        )
    ,TableC AS (
        SELECT  *
        FROM    (
            VALUES  (1, 'A', 'DataC')
            ,   (1, 'C', 'DataD')
        ) x (A_ID, Name, SomeOtherData1)
    )
    SELECT  *
    FROM    (
        SELECT  ISNULL(b.A_ID, c.A_ID) AS ID
        ,   b.Name AS BName
        ,   c.Name AS CName
        ,   b.Data1 AS BData1
        ,   c.SomeOtherData1 AS CSomeOtherData1
        FROM    TableB b
        FULL JOIN TableC c
            ON  c.A_ID = b.A_ID
            AND c.Name = b.Name
        ) x
    INNER JOIN TableA ta
        ON  ta.ID = x.ID
    

    Outputs:

    ID BName CName BData1 CSomeOtherData1 ID Data Data2
    1 A A Data1 DataC 1 Test Zest
    1 NULL C NULL DataD 1 Test Zest
    1 B NULL Data2 NULL 1 Test Zest

    It's then relatively easy to add more tables to the mix, but it's kinda tedious to get the related Data-columns, because one need to specify them inside the subquery.

    You could also just gather the names, and then LEFT JOIN back on B / C outside of the FULL JOIN subquery, it will be slightly slower though.

    Probably your data model is not correct, but hard to say from this example. I'm not sure what "Name is not dimension means", but if it's not unique, then it should be reflected in your sample data.