sql-servert-sqlsql-mergescd2

MERGE attempted to UPDATE or DELETE the same row more than once


I am trying to load a standard Kimball SCD2 Dimension, using a merge statement which I got from the following website: http://www.kimballgroup.com/2008/11/design-tip-107-using-the-sql-merge-statement-for-slowly-changing-dimension-processing/

This merge statement is the same except to handle new entities. This will be handled as a direct insert in the dataflow. This problem concerns only multiple versions of the same business key.

When I execute the merge statement SQL returns the error:

Msg 8672, Level 16, State 1, Line 3
The MERGE statement attempted to UPDATE or DELETE the same row more than once. This happens when a target row matches more than one source row.
A MERGE statement cannot UPDATE/DELETE the same row of the target table multiple times. Refine the ON clause to ensure a target row matches at most one source row, or use the GROUP BY clause to group the source rows.

I am using SQL Server 2012:

SOURCE DATASET

enter image description here

TARGET DATASET

enter image description here

This is what I expected:

enter image description here

Below you can find the script to reproduce the problem:

CREATE TABLE SANDBOX.EHN.SOURCE_SCD2 (
  BUSINESS_KEY  BIGINT
 ,DESCRIPTION_A VARCHAR(2)
 ,M_CRC         BIGINT
 ,StartDATE     DATE
 ,EndDATE       DATE )



CREATE TABLE SANDBOX.EHN.TARGET_SCD2 (
  BUSINESS_KEY  BIGINT
 ,DESCRIPTION_A VARCHAR(2)
 ,M_CRC         BIGINT
 ,StartDATE     DATE
 ,EndDATE       DATE )



 select *
 from SANDBOX.EHN.TARGET_SCD2

 truncate table SANDBOX.EHN.TARGET_SCD2

 INSERT INTO SANDBOX.EHN.SOURCE_SCD2 VALUES (1, 'B',  1,   '2015-05-16', '2015-06-01')
 INSERT INTO SANDBOX.EHN.SOURCE_SCD2 VALUES (1, 'C',  2,   '2015-06-01', '2015-06-11')
 INSERT INTO SANDBOX.EHN.SOURCE_SCD2 VALUES (1, 'D',  3,   '2015-06-11', '9999-12-31')

 INSERT INTO SANDBOX.EHN.TARGET_SCD2 VALUES (1,  'A', 0,    '2015-01-16', '9999-12-31')



INSERT INTO SANDBOX.EHN.TARGET_SCD2 
     SELECT BUSINESS_KEY
          ,DESCRIPTION_A
         ,M_CRC
         ,StartDATE
         ,EndDATE
FROM (
MERGE SANDBOX.EHN.TARGET_SCD2 D
USING SANDBOX.EHN.SOURCE_SCD2 UPD
ON(D.BUSINESS_KEY = UPD.BUSINESS_KEY )
    WHEN MATCHED AND D.EndDATE = '9999-12-31' 
    THEN UPDATE SET  D.EndDATE =  UPD.EndDATE
OUTPUT $Action Action_Out, UPD.BUSINESS_KEY
                         , UPD.DESCRIPTION_A
                    , UPD.M_CRC
                    , UPD.StartDATE
                    , UPD.EndDATE
)AS MERGE_OUT
WHERE MERGE_OUT.Action_Out = 'UPDATE'

Can you help me to fix this problem?


Solution

  • For the last update only use;

    INSERT INTO SANDBOX.EHN.TARGET_SCD2 
         SELECT BUSINESS_KEY
              ,DESCRIPTION_A
             ,M_CRC
             ,StartDATE
             ,EndDATE
    FROM (
    MERGE SANDBOX.EHN.TARGET_SCD2 D
    USING SANDBOX.EHN.SOURCE_SCD2 UPD
    ON(D.BUSINESS_KEY = UPD.BUSINESS_KEY AND UPD.EndDATE = '9999-12-31')
        WHEN MATCHED AND D.EndDATE = '9999-12-31' 
        THEN UPDATE SET  D.EndDATE =  UPD.StartDATE
    OUTPUT $Action Action_Out, UPD.BUSINESS_KEY
                             , UPD.DESCRIPTION_A
                             , UPD.M_CRC
                             , UPD.StartDATE
                             , UPD.EndDATE
    )AS MERGE_OUT
    WHERE MERGE_OUT.Action_Out = 'UPDATE'
    

    If you want ALL SRC rows in your target table then I agree with Nick.McDermaid

    For ALL rows use;

    UPDATE TRG
      SET TRG.EndDate = SRC.StartDATE
    FROM SANDBOX.EHN.TARGET_SCD2 TRG
    JOIN  ( select SRC.BUSINESS_KEY, min(src.StartDATE)StartDATE
            from SANDBOX.EHN.SOURCE_SCD2 SRC
              group by SRC.BUSINESS_KEY
           )SRC
    on  ( TRG.BUSINESS_KEY = SRC.BUSINESS_KEY
           AND SRC.StartDate > TRG.StartDate ) 
    where 1 = 1
    
    
    INSERT SANDBOX.EHN.TARGET_SCD2
    SELECT * FROM SANDBOX.EHN.SOURCE_SCD2