sqlduplicatesnot-exists

How can I add 'DOES NOT EXIST' to my SQL query?


I need to insert a new row from temp table (temp) if it does not exist in table (t). The problem I have is that there is not a primary key and the temp table is truncated each time its imported. I am merging the data from temp to t with a stored procedure but I only want to import rows that do not exist. I have tried at the start of the insert query:

IF NOT EXISTS (SELECT * 
               FROM FPS.Predictive_Acquisition_Cost_NEW sa 
               LEFT JOIN FPS_LZ.FPS.Predictive_Acquisition_Cost lz 
                         ON lz.Drug_Identifier = sa.Drug_Identifier
               WHERE sa.Drug_Identifier = lz.Drug_Identifier 
                 AND lz.Effective_Date <> sa.Effective_Date)

and at the end of the insert query:

WHERE NOT EXISTS(SELECT * FROM FPS.Predictive_Acquisition_Cost_NEW)

Full insert query for better reference:

--IF NOT EXISTS (SELECT * FROM FPS.Predictive_Acquisition_Cost_NEW sa LEFT JOIN FPS_LZ.FPS.Predictive_Acquisition_Cost lz ON lz.Drug_Identifier = sa.Drug_Identifier WHERE sa.Drug_Identifier = lz.Drug_Identifier AND lz.Effective_Date <> sa.Effective_Date)

    INSERT INTO FPS_SA.FPS.Predictive_Acquisition_Cost_NEW
    (
    Drug_Identifier,
    Identifier_Type,
    Drug_Group,
    Brand_Generic,
    PAC,
    PAC_Low,
    PAC_High,
    PAC_Retail,
    Error_Code,
    Effective_Date,
    End_Date,
    PAC_Model_Version
   
    )
    SELECT lz.Drug_Identifier,
       lz.Identifier_Type,
       lz.Drug_Group,
       lz.Brand_Generic,
       lz.PAC,
       lz.PAC_Low,
       lz.PAC_High,
       lz.PAC_Retail,
       lz.Error_Code,
       lz.Effective_Date,
       lz.End_Date,
       lz.PAC_Model_Version
      
    FROM FPS_LZ.fps.Predictive_Acquisition_Cost lz LEFT OUTER JOIN          FPS_SA.FPS.Predictive_Acquisition_Cost sa ON lz.Drug_Identifier = sa.Drug_Identifier`
--WHERE NOT EXISTS(SELECT * FROM FPS.Predictive_Acquisition_Cost_NEW)

I tried an "IF NOT EXISTS" at the beginning of the query and/or a "WHERE NOT EXISTS" at the end of the query.

I keep getting duplicate rows in the destination table.


Solution

  • If there is no matching via a unique key you must decide which columns to include when comparing rows from each table. For example, the following would identify rows from the new table that do not exist in the existing table:

    SELECT
          t.*
    FROM FPS.Predictive_Acquisition_Cost_NEW t
    WHERE NOT EXISTS(SELECT NULL 
                     FROM FPS_LZ.fps.Predictive_Acquisition_Cost lz 
                     WHERE  lz.Drug_Identifier = t.Drug_Identifier
                        AND lz.Identifier_Type = t.Identifier_Type
                        AND lz.Drug_Group = t.Drug_Group
                        AND lz.Brand_Generic = t.Brand_Generic
                        AND lz.PAC = t.PAC
                        AND lz.PAC_Low = t.PAC_Low
                        AND lz.PAC_High = t.PAC_High
                        AND lz.PAC_Retail = t.PAC_Retail
                        AND lz.Error_Code = t.Error_Code
                        AND lz.Effective_Date = t.Effective_Date
                        AND lz.End_Date = t.End_Date
                        AND lz.PAC_Model_Version = t.PAC_Model_Version
                     )
    

    In the reverse, identify rows from the existing table that do not exist in the new table just adjust the table references:

    SELECT
          lz.*
    FROM FPS_LZ.fps.Predictive_Acquisition_Cost lz
    WHERE NOT EXISTS(SELECT NULL 
                     FROM FPS.Predictive_Acquisition_Cost_NEW t 
                     WHERE  lz.Drug_Identifier = t.Drug_Identifier
                        AND lz.Identifier_Type = t.Identifier_Type
                        AND lz.Drug_Group = t.Drug_Group
                        AND lz.Brand_Generic = t.Brand_Generic
                        AND lz.PAC = t.PAC
                        AND lz.PAC_Low = t.PAC_Low
                        AND lz.PAC_High = t.PAC_High
                        AND lz.PAC_Retail = t.PAC_Retail
                        AND lz.Error_Code = t.Error_Code
                        AND lz.Effective_Date = t.Effective_Date
                        AND lz.End_Date = t.End_Date
                        AND lz.PAC_Model_Version = t.PAC_Model_Version
                     )
    

    The main point is that to use NOT EXISTS(subquery) that subquery must contain a where clause with predicates that compare values from both tables. In particular note how each predicate in there where clause refers to lz (existing) and t (new).

    nb: "select null" or "select *" or "select 1" can be used in the subquery as it will make no difference to the outcome. The vital part of the subquery in a NOT EXISTS construct is the where clause.