If this query:
SELECT CONCAT(SOURCE.OrderNo, '_', SOURCE.OrderLine),
SOURCE.ProdOrder,
SOURCE.Lvl1,
SOURCE.Lvl2,
SOURCE.Lvl3,
SOURCE.LastDate
FROM dbo.SourceTbl AS SOURCE
returns 11 records and this query:
SELECT CONCAT(TARGET.OrderNo, '_', TARGET.OrderLine),
TARGET.ProdOrder,
TARGET.Lvl1,
TARGET.Lvl2,
TARGET.Lvl3,
TARGET.LastDate
FROM dbo.TargetTbl AS TARGET
returns 17 records and the INTERSECT between the two:
SELECT CONCAT(SOURCE.OrderNo, '_', SOURCE.OrderLine),
SOURCE.ProdOrder,
SOURCE.Lvl1,
SOURCE.Lvl2,
SOURCE.Lvl3,
SOURCE.LastDate
FROM dbo.SourceTbl AS SOURCE
INTERSECT
SELECT CONCAT(TARGET.OrderNo, '_', TARGET.OrderLine),
TARGET.ProdOrder,
TARGET.Lvl1,
TARGET.Lvl2,
TARGET.Lvl3,
TARGET.LastDate
FROM dbo.TargetTbl AS TARGET
returns 9 records, when I do a MERGE like this:
MERGE dbo.TargetTbl AS TARGET
USING (
SELECT OrderNo, OrderLine, CONCAT(OrderNo, '_', OrderLine) AS OrderNoLine, SomeModel, ProdOrder, Lvl1, Lvl2, Lvl3,
MAX(LastDate) AS LastDate
FROM dbo.SourceTbl
GROUP BY OrderNo, OrderLine, CONCAT(OrderNo, '_', OrderLine), SomeModel, ProdOrder, Lvl1, Lvl2, Lvl3
) AS SOURCE
ON CONCAT(TARGET.OrderNo, '_', TARGET.OrderLine) = OrderNoLine
AND TARGET.ProdOrder = SOURCE.ProdOrder
AND TARGET.Lvl1 = SOURCE.Lvl1
AND TARGET.Lvl2 = SOURCE.Lvl2
AND TARGET.Lvl3 = SOURCE.Lvl3
AND TARGET.LastDate = SOURCE.LastDate
WHEN MATCHED AND EXISTS (SELECT CONCAT(SOURCE.OrderNo, '_', SOURCE.OrderLine)
,SOURCE.ProdOrder
,SOURCE.Lvl1
,SOURCE.Lvl2
,SOURCE.Lvl3
,SOURCE.LastDate
INTERSECT
SELECT CONCAT(TARGET.OrderNo, '_', TARGET.OrderLine)
,TARGET.ProdOrder
,TARGET.Lvl1
,TARGET.Lvl2
,TARGET.Lvl3
,TARGET.LastDate
)
THEN UPDATE SET TARGET.IsBlocked = 1, TARGET.BlockDate = GETDATE()
WHEN NOT MATCHED BY TARGET
THEN INSERT (LastDate, UsrID, DepID, OrderNo, OrderLine, SomeModel, ProdOrder, Lvl1, Lvl2, Lvl3, IsBlocked, BlockDate)
VALUES (SOURCE.LastDate, 999, 999, SOURCE.OrderNo, SOURCE.OrderLine, SOURCE.SomeModel, SOURCE.ProdOrder, SOURCE.Lvl1, SOURCE.Lvl2, SOURCE.Lvl3, 1, GETDATE());
it should, according to this and this, UPDATE the 9 INTERSECT records of the TargetTbl and INSERT to that same table the remaining 2 records from the SourceTbl (11 in total). Instead it updates 4 records and inserts 6 records (10 in total). Two records in the SourceTbl are duplicated and that's the reason of 10 instead of 11 and that's also why I used the MAX & GROUP BY.
I think it's the first part of the query, the USING part, that cannot handle NULLs correctly even if the INTERSECT part does its job. I tried everything I was able to, but no success. I'm sure it's something easily doable, so please, help me. Thank you.
EDIT: SourceTbl data by using SELECT OrderNo, OrderLine, CONCAT(OrderNo, '_', OrderLine) AS OrderNoLine, SomeModel, ProdOrder, Lvl1, Lvl2, Lvl3, LastDate AS LastDate FROM dbo.SourceTbl ORDER BY OrderNo, OrderLine, SomeModel, ProdOrder
, irrelevant columns are omitted:
OrderNo OrderLine OrderNoLine SomeModel ProdOrder Lvl1 Lvl2 Lvl3 LastDate
123c08637 10 123c08637_10 4321525175_004321 A5C008837 Abcd Efgh Olol 04/03/2030
123c11214 10 123c11214_10 4321532622_000391 NULL NULL NULL NULL 07/07/2018
123c13039 10 123c13039_10 4321525175_002611 A5C014838 NULL NULL NULL 18/05/2018
123c16059 10 123c16059_10 4321541488_001111 A5C018611 NULL NULL NULL 18/05/2018
123c17482 10 123c17482_10 4321506480_001711 A5C019227 Asdf Ghjk Cvnm 12/12/2018
123c17482 10 123c17482_10 4321506480_001711 A5C047712 Asdf Ghjk Cvnm 12/12/2018
123c17482 20 123c17482_20 4321506480_001712 A5B072554 aaaa bbbb cccc 18/05/2018
123c17482 20 123c17482_20 4321506480_001712 A5B072554 aaaa bbbb cccc 18/05/2018
123c17482 20 123c17482_20 4321506480_001712 A5B072554 aaaa bbbb xxxx 18/05/2018
123c17482 20 123c17482_20 4321506480_001712 A5B200472 NULL NULL NULL 18/05/2018
123c32405 10 123c32405_10 8765525667_005301 NULL Qwer Uiop Tygh 12/12/2018
The GROUP BY might reduce the number of records to only one (if the 11 records only differ in the LastDate column and if SomeModel contains the same value for all 11 records) or it might result in all 11 records (if SomeModel contains unique values), so that GROUP BY does not neccessarily resturn the 10 distinct rows. To achieve this, use SELECT DISTINCT instead of grouping by a subset of the columns.
Also, if the ON condition worked as you seem to expect it, the additional EXISTS condition was obsolete. Obviously, 4 matches are found and 6 records have no match. Within these 6, there could be 2 records that indeed have no match and 4 records that do not match because of NULL values.
To take care of NULL values, I suggest to change the whole statement to something like this:
MERGE dbo.TargetTbl AS TARGET
USING (
SELECT DISTINCT OrderNo, OrderLine, ProdOrder, Lvl1, Lvl2, Lvl3, LastDate
FROM dbo.SourceTbl
) AS SOURCE
ON (TARGET.OrderNo = SOURCE.OrderNo OR TARGET.OrderNo IS NULL AND SOURCE.OrderNo IS NULL)
AND (TARGET.OrderLine = SOURCE.OrderLine OR TARGET.OrderLine IS NULL AND SOURCE.OrderLine IS NULL)
AND (TARGET.ProdOrder = SOURCE.ProdOrder OR TARGET.ProdOrder IS NULL AND SOURCE.ProdOrder IS NULL)
AND (TARGET.Lvl1 = SOURCE.Lvl1 OR TARGET.Lvl1 IS NULL AND SOURCE.Lvl1 IS NULL)
AND (TARGET.Lvl2 = SOURCE.Lvl2 OR TARGET.Lvl2 IS NULL AND SOURCE.Lvl2 IS NULL)
AND (TARGET.Lvl3 = SOURCE.Lvl3 OR TARGET.Lvl3 IS NULL AND SOURCE.Lvl3 IS NULL)
AND (TARGET.LastDate = SOURCE.LastDate OR TARGET.LastDate IS NULL AND SOURCE.LastDate IS NULL)
WHEN MATCHED
THEN UPDATE SET TARGET.IsBlocked = 1, TARGET.BlockDate = GETDATE()
WHEN NOT MATCHED BY TARGET
THEN INSERT (LastDate, UsrID, DepID, OrderNo, OrderLine, SomeModel, ProdOrder, Lvl1, Lvl2, Lvl3, IsBlocked, BlockDate)
VALUES (LastDate, 999, 999, OrderNo, OrderLine, SomeModel, ProdOrder, Lvl1, Lvl2, Lvl3, 1, GETDATE());