sql-servermergedata-warehouseetltype-2-dimension

Tracking what the MERGE command and its OUTPUT did


I am modifying a Type 2 dimension using the following (long) SQL statement:

INSERT INTO AtlasDataWarehouseReports.District
(
    Col01,
    Col02,
    Col03,
    Col04,
    Col05,
    Col06,
    Col07,
    Col08,
    Col09,
    Col10,
    StartDateTime,
    EndDateTime
)
SELECT
    Col01,
    Col02,
    Col03,
    Col04,
    Col05,
    Col06,
    Col07,
    Col08,
    Col09,
    Col10,
    CONVERT (DATETIME, CONVERT (Varchar, GetDate(), 101)) AS StartDateTime,
    NULL AS EndDateTime
FROM 
(
    MERGE AtlasDataWarehouseReports.District AS MergeTarget

    USING Staging.District as MergeSource
        ON MergeTarget.Col01 = MergeSource.Col01
        AND MergeTarget.EndDateTime IS NULL

    WHEN MATCHED 
        AND (
                MergeTarget.Col02 <> MergeSource.Col02
                OR MergeTarget.Col05 <> MergeSource.Col05
            )
    THEN
        UPDATE SET MergeTarget.EndDateTime = CONVERT (DATETIME, CONVERT (Varchar, GetDate(), 101))

    WHEN NOT MATCHED 
    THEN
        INSERT 
        (
            Col01,
            Col02,
            Col03,
            Col04,
            Col05,
            Col06,
            Col07,
            Col08,
            Col09,
            Col10,
            StartDateTime,
            EndDateTime
        )
        VALUES 
        (
            MergeSource.Col01,
            MergeSource.Col02,
            MergeSource.Col03,
            MergeSource.Col04,
            MergeSource.Col05,
            MergeSource.Col06,
            MergeSource.Col07,
            MergeSource.Col08,
            MergeSource.Col09,
            MergeSource.Col10,
            CONVERT (DATETIME, CONVERT (Varchar, GetDate(), 101)),
            NULL
        )
    OUTPUT $Action as MergeAction, MergeSource.*
) AS MergeOutput
WHERE 1=1
    AND MergeOutput.MergeAction = 'UPDATE';

I am running this as a part of the ETL that loads my data warehouse. What I want to build in is a detailed logging system that can track all the changes in a log table/file etc.

All the work that is actually done during the MERGE and the OUTPUT used in the INSERT, is behind the scenes. I want to track all the columns and values that participated in this query.

Is there any way for me to capture this data?


Solution

  • I believe the disadvantage of using the T-SQL MERGE statement here is that you will obscure what is actually happening and so introducing logging may require duplicated effort.

    Just my thoughts but you seem to be re-inventing the wheel a little here.

    SQL Server Integration Services (SSIS) offers pre-built components for these exact processing requirements.

    For example, you can use the existing Slowly Changing Dimension Transformation component, or you can use a number of lower level components to implement your own custom solution but subsequently incorporate logging.