I have two tables. Staging table refreshes every day and replaces LoadDate
column with today's date. This data gets inserted into Final table where I want to insert just the new data.
Staging table:
MachineName | ApplicationName | LoadDate |
---|---|---|
A | Excel | 22-02-2023 |
B | Excel | 22-02-2023 |
C | PowerPoint | 22-02-2023 |
D | Word | 22-02-2023 |
A | Word | 22-02-2023 |
C | Word | 22-02-2023 |
Final table:
MachineName | ApplicationName | LoadDate |
---|---|---|
A | Excel | 20-02-2023 |
B | Excel | 20-02-2023 |
C | PowerPoint | 21-02-2023 |
A | Word | 21-02-2023 |
I have written my query like this:
INSERT INTO [Final]
SELECT *
FROM [Staging]
WHERE NOT EXISTS (SELECT 1 FROM [FINAL]
WHERE [Final].[MachineName] = [Staging].[MachineName]
AND [Final].[ApplicationName] = [Staging].[ApplicationName])
E.g in the Final table, A-Excel, B-Excel, C-PowerPoint and A-Word were already present in the table so they stay as they are. In the Staging table C-Word and D-Word is the new value and this needs to get inserted into the Final table. Although I have written the query to the best of my knowledge, I am getting duplicate values in the Final table which should not happen. What have I done wrong?
Your query should work as you expect it to. Are you certain that you supplied the SQL that is causing duplication?
As an alternative, here is a MERGE statement:
MERGE Final AS tgt
USING Staging AS src
ON tgt.MachineName=src.MachineName
AND tgt.ApplicationName=src.ApplicationName
WHEN NOT MATCHED BY TARGET THEN
INSERT(MachineName, ApplicationName, LoadDate)
VALUES(MachineName,ApplicationName,LoadDate);