sqlsql-serverjoinsql-server-2019

Select top 1 row from multiple rows in inner join


I am trying to prepare query to find top 1 row from multiple rows in join query.

I have 2 tables history table and new data table. In new data table always comes in pair but in some case data not come in pair so to find missing pairs, am preparing query.

Pair table looks like this:

enter image description here

And the history table looks like this:

Id  Tnum    Snum    Rnum
------------------------
1   A1234   F1       0
2   A1234   N1       0
3   B1234   SP       2
4   B1234   FW       2
5   A1234   F1       1
6   A1234   N1       1
7   C1234   I1       0
8   C1234   I2       0
9   A1234   F1       2
10  A1234   N1       2

New data table is here:

Id  Tnum    Snum    Rnum
-------------------------
1   A1234   F1       3
2   B1234   FW       3
3   C1234   I2       1

As per pair table, new data should be always in pair means if Snum is F1 then there should be also record for N1 for that Tnum and if Snum is FW then there should be also record for SP. In new table we can see records are not in pair so I have to find missing pairs.

How I have to check missing pairs : Check missing pair's previous entry in history table and take latest Rnum entry of that missing pair. For example : In data table we have Snum = F1 for Tnum A1234. And for this Tnum, N1 is not there so I have to take N1's highest Rnum entry for this Tnum from history table. For Snum = F1 and Tnum A1234, Missing pair entry would be Snum = N1 for Tnum A1234 and Rnum = 2.

Expected output of missing pair in new data table should look like this:

 History_Id Tnum    Snum    Rnum
 --------------------------------
  10        A1234    N1      2
  3         B1234    SP      2
  7         C1234    I1      0

I have prepared query as below but its not giving output as expected. Its returning only 1 record but It should return 3 records as expected output.

WITH previous_latest_missing_pair as 
(
    SELECT *
    FROM history_table
    WHERE id IN (SELECT MAX(h.id)
                 FROM history_table h 
                 JOIN new_data_table n ON n.Tnum = h.Tnum 
                 WHERE h.Snum = CASE
                                    WHEN n.Snum = 'SP' THEN 'FW'
                                    WHEN n.Snum = 'FW' THEN 'SP'
                                    WHEN n.Snum = 'F1' THEN 'N1'
                                    WHEN n.Snum = 'N1' THEN 'F1'
                                    WHEN n.Snum = 'I1' THEN 'I2'
                                    WHEN n.Snum = 'I2' THEN 'I1'
                                END)
)
SELECT h.*
FROM history_table h
JOIN previous_latest_missing_pair p ON p.Tnum = h.Tnum
                                    AND p.Snum = h.Snum
                                    AND p.Rnum = h.Rnum

Solution

  • Once you map the new data Snum value to the paired Snum value, you can use a CROSS APPLY (SELECT TOP 1 ... ORDER BY ...) pattern to select the best matching history row. For this use case, best is defined as the row with the highest Rnum value.

    with pairs as (
        select v.*
        from (
            values
                ('F1', 'N1'), ('N1', 'F1'),
                ('SP', 'FW'), ('FW', 'SP'),
                ('I1', 'I2'), ('I2', 'I1')
        ) v(First, Second)
    )
    select h1.*
    from new_data_table n
    join pairs p on p.First = n.Snum
    cross apply (
        select top 1 h.*
        from history_table h
        where h.Tnum = n.Tnum
        and h.Snum = p.Second
        order by h.Rnum desc
    ) h1
    

    Instead of a CASE expression, I've elected to encapsulate the pair data in a CTE, so that the primary query logic is more concise. You also might be able to generate the pairings from existing history data:

    with pairs as (
        select distinct h1.Snum AS First, h2.Snum AS Second
        from history_table h1
        join history_table h2
            on h2.Tnum = h1.Tnum
            and h2.Snum <> h1.Snum
    )
    select ...
    

    Another alternative is to load the pair data into a @table variable or #temp table. This does not require create table access on your current database .

    Results:

    Id Tnum Snum Rnum
    10 A1234 N1 2
    3 B1234 SP 2
    7 C1234 I1 0

    See this db<>fiddle for a demo.