sqlsql-serverjoin

SQL query creating duplicate in JOIN


Currently working on a query that will join 2 tables. Each table has a column called RecordId which maps into both tables.

When I run the following query.

declare @id INT = 103;
SELECT TOP (1000) [RecordId]
  ,[Status]
  ,[Reject_Reason]
  ,[Datetime]
FROM [LNCVWBlockDB].[dbo].[tbOperation140] WHERE RecordId = @id

SELECT TOP (1000) [RecordId]
  ,[Status]
  ,[Camera_Grade]
  ,[Lasermark]
  ,[Datetime]
FROM [LNCVWBlockDB].[dbo].[tbOperation130] WHERE RecordId = @id

My result are the following:

enter image description here

Which is correct because the record is in tbOperation140 and tbOperation130 twice, meaning that item has gone through both operation stations.

When I try to join both tables in order to make a view using the following query:

declare @id INT = 103;
SELECT * 
FROM tbOperation140
INNER JOIN tbOperation130
ON tbOperation140.RecordId = tbOperation130.RecordId
WHERE tbOperation140.RecordId = @id

I get the following result:

enter image description here

Which gives me 4 records where in reality it should only have 2 records.* Each record should be mapped to each other by column Datetime which is a datetime column.


Solution

  • what’s happening here. The duplication is caused by the one-to-many relationship between your tables

    DECLARE @id INT = 103;
    
    -- Use Common Table Expressions (CTEs) to rank the records in each table.
    -- The ranking is partitioned by RecordId and ordered by the Datetime.
    WITH RankedOp140 AS (
        SELECT 
            *,
            ROW_NUMBER() OVER (PARTITION BY RecordId ORDER BY Datetime ASC) AS rn
        FROM 
            tbOperation140
    ),
    RankedOp130 AS (
        SELECT 
            *,
            ROW_NUMBER() OVER (PARTITION BY RecordId ORDER BY Datetime ASC) AS rn
        FROM 
            tbOperation130
    )
    
    -- Join the two ranked sets of data on both the RecordId and the generated row number.
    SELECT 
        r140.RecordId,
        r140.Status AS Status_140,
        r140.Reject_Reason,
        r140.Datetime AS Datetime_140,
        r130.Status AS Status_130,
        r130.Camera_Grade,
        r130.Lasermark,
        r130.Datetime AS Datetime_130
    FROM 
        RankedOp140 AS r140
    INNER JOIN 
        RankedOp130 AS r130 
        ON r140.RecordId = r130.RecordId AND r140.rn = r130.rn -- The join now requires the rank to match
    WHERE 
        r140.RecordId = @id;