sqlsql-serversql-server-2008type-2-dimension

SQL to Build Type 2 Dimension


I need to build a type 2 dimension table to store price changes for various products. In the source data, there are two tables I can pull from. One has the current price of each product, and one has the price change history for each product. Prices for some products change more than others, and if a product's price has never changed, it won't have a record in the price change table at all.

Given the following current price table:

PRODUCT_ID  CURRENT_PRICE
----------  -------------
  ABC123        250
  DEF456        200
  GHI789        325

And product price history table:

PRODUCT_ID  OLD_PRICE   NEW_PRICE   CHANGE_DATE
----------  ---------   ---------   -----------
  ABC123        275       250        1/1/2016
  DEF456        250       225        6/1/2015
  DEF456        225       200        1/1/2016

What SQL could I run to populate a type 2 dimension as follows:

PRODUCT_ID  PRODUCT_PRICE   VALID_FROM  VALID_TO    CURRENT_PRICE_INDICATOR
----------  -------------   ----------  --------    ----------------------
  ABC123        275          1/1/1900   12/31/2015      N       
  ABC123        250          1/1/2016   12/31/9999      Y
  DEF456        250          1/1/1900   5/31/2015       N
  DEF456        225          6/1/2015   12/31/2015      N
  DEF456        200          1/1/2016   12/31/9999      Y
  GHI789        325          1/1/1900   12/31/9999      Y

Solution

  • I think it is something like this:

    DECLARE @price TABLE(PRODUCT_ID VARCHAR(100),CURRENT_PRICE DECIMAL(8,4));
    INSERT INTO @price VALUES
     ('ABC123',250)
    ,('DEF456',200)
    ,('GHI789',325);
    
    DECLARE @priceHist TABLE(PRODUCT_ID VARCHAR(100),OLD_PRICE DECIMAL(8,4),NEW_PRICE DECIMAL(8,4),CHANGE_DATE DATE);
    INSERT INTO @priceHist VALUES
     ('ABC123',275,250,{d'2016-01-01'})
    ,('DEF456',250,225,{d'2015-06-01'})
    ,('DEF456',225,200,{d'2016-01-01'});
    
    WITH AllData AS
    (
        SELECT ROW_NUMBER() OVER(PARTITION BY Combined.PRODUCT_ID ORDER BY ISNULL(Combined.CHANGE_DATE,{d'9999-12-31'}) ASC) AS Inx
              ,*
              ,CASE WHEN CHANGE_DATE IS NULL THEN 'Y' ELSE 'N' END AS CURRENT_PRICE_INDICATOR
        FROM
        (
            SELECT p.PRODUCT_ID AS PRODUCT_ID
                  ,p.CURRENT_PRICE AS PRODUCT_PRICE
                  ,NULL AS CHANGE_DATE
            FROM @price AS p
            UNION ALL
            SELECT ph.PRODUCT_ID
                  ,ph.OLD_PRICE
                  ,ph.CHANGE_DATE
            FROM @priceHist AS ph
        ) AS Combined
    )
    SELECT ad.PRODUCT_ID
          ,ad.PRODUCT_PRICE
          --Version with LAG (SQL Server 2012 and higher)
          --,CASE WHEN ad.Inx=1 THEN {d'1900-01-01'} ELSE LAG(ad.CHANGE_DATE,1) OVER(PARTITION BY ad.PRODUCT_ID ORDER BY ISNULL(ad.CHANGE_DATE,{d'9999-12-31'}) ASC) END AS VALID_FROM
          ,CASE WHEN ad.Inx=1 THEN {d'1900-01-01'} ELSE LAG_Replace_For_SQLServer2008.CHANGE_DATE END AS VALID_FROM
          ,CASE WHEN ad.CURRENT_PRICE_INDICATOR='Y' THEN {d'9999-12-31'} ELSE DATEADD(DAY,-1,ad.CHANGE_DATE) END AS VALID_TO
          ,ad.CURRENT_PRICE_INDICATOR
    FROM AllData AS ad
    OUTER APPLY
    (
        SELECT x.CHANGE_DATE 
        FROM AllData AS x
        WHERE x.PRODUCT_ID=ad.PRODUCT_ID
          AND x.Inx=ad.Inx-1
    ) LAG_Replace_For_SQLServer2008
    

    The result:

    ABC123  275.0000    1900-01-01  2015-12-31  N
    ABC123  250.0000    2016-01-01  9999-12-31  Y
    DEF456  250.0000    1900-01-01  2015-05-31  N
    DEF456  225.0000    2015-06-01  2015-12-31  N
    DEF456  200.0000    2016-01-01  9999-12-31  Y
    GHI789  325.0000    1900-01-01  9999-12-31  Y