I am trying to load a standard Kimball SCD2 Dimension, using a merge statement which I got from the following website: http://www.kimballgroup.com/2008/11/design-tip-107-using-the-sql-merge-statement-for-slowly-changing-dimension-processing/
This merge statement is the same except to handle new entities. This will be handled as a direct insert in the dataflow. This problem concerns only multiple versions of the same business key.
When I execute the merge statement SQL returns the error:
Msg 8672, Level 16, State 1, Line 3
The MERGE statement attempted to UPDATE or DELETE the same row more than once. This happens when a target row matches more than one source row.
A MERGE statement cannot UPDATE/DELETE the same row of the target table multiple times. Refine the ON clause to ensure a target row matches at most one source row, or use the GROUP BY clause to group the source rows.
I am using SQL Server 2012:
SOURCE DATASET
TARGET DATASET
This is what I expected:
Below you can find the script to reproduce the problem:
CREATE TABLE SANDBOX.EHN.SOURCE_SCD2 (
BUSINESS_KEY BIGINT
,DESCRIPTION_A VARCHAR(2)
,M_CRC BIGINT
,StartDATE DATE
,EndDATE DATE )
CREATE TABLE SANDBOX.EHN.TARGET_SCD2 (
BUSINESS_KEY BIGINT
,DESCRIPTION_A VARCHAR(2)
,M_CRC BIGINT
,StartDATE DATE
,EndDATE DATE )
select *
from SANDBOX.EHN.TARGET_SCD2
truncate table SANDBOX.EHN.TARGET_SCD2
INSERT INTO SANDBOX.EHN.SOURCE_SCD2 VALUES (1, 'B', 1, '2015-05-16', '2015-06-01')
INSERT INTO SANDBOX.EHN.SOURCE_SCD2 VALUES (1, 'C', 2, '2015-06-01', '2015-06-11')
INSERT INTO SANDBOX.EHN.SOURCE_SCD2 VALUES (1, 'D', 3, '2015-06-11', '9999-12-31')
INSERT INTO SANDBOX.EHN.TARGET_SCD2 VALUES (1, 'A', 0, '2015-01-16', '9999-12-31')
INSERT INTO SANDBOX.EHN.TARGET_SCD2
SELECT BUSINESS_KEY
,DESCRIPTION_A
,M_CRC
,StartDATE
,EndDATE
FROM (
MERGE SANDBOX.EHN.TARGET_SCD2 D
USING SANDBOX.EHN.SOURCE_SCD2 UPD
ON(D.BUSINESS_KEY = UPD.BUSINESS_KEY )
WHEN MATCHED AND D.EndDATE = '9999-12-31'
THEN UPDATE SET D.EndDATE = UPD.EndDATE
OUTPUT $Action Action_Out, UPD.BUSINESS_KEY
, UPD.DESCRIPTION_A
, UPD.M_CRC
, UPD.StartDATE
, UPD.EndDATE
)AS MERGE_OUT
WHERE MERGE_OUT.Action_Out = 'UPDATE'
Can you help me to fix this problem?
For the last update only use;
INSERT INTO SANDBOX.EHN.TARGET_SCD2
SELECT BUSINESS_KEY
,DESCRIPTION_A
,M_CRC
,StartDATE
,EndDATE
FROM (
MERGE SANDBOX.EHN.TARGET_SCD2 D
USING SANDBOX.EHN.SOURCE_SCD2 UPD
ON(D.BUSINESS_KEY = UPD.BUSINESS_KEY AND UPD.EndDATE = '9999-12-31')
WHEN MATCHED AND D.EndDATE = '9999-12-31'
THEN UPDATE SET D.EndDATE = UPD.StartDATE
OUTPUT $Action Action_Out, UPD.BUSINESS_KEY
, UPD.DESCRIPTION_A
, UPD.M_CRC
, UPD.StartDATE
, UPD.EndDATE
)AS MERGE_OUT
WHERE MERGE_OUT.Action_Out = 'UPDATE'
If you want ALL SRC rows in your target table then I agree with Nick.McDermaid
For ALL rows use;
UPDATE TRG
SET TRG.EndDate = SRC.StartDATE
FROM SANDBOX.EHN.TARGET_SCD2 TRG
JOIN ( select SRC.BUSINESS_KEY, min(src.StartDATE)StartDATE
from SANDBOX.EHN.SOURCE_SCD2 SRC
group by SRC.BUSINESS_KEY
)SRC
on ( TRG.BUSINESS_KEY = SRC.BUSINESS_KEY
AND SRC.StartDate > TRG.StartDate )
where 1 = 1
INSERT SANDBOX.EHN.TARGET_SCD2
SELECT * FROM SANDBOX.EHN.SOURCE_SCD2