sqlteradataminwindowing

Teradata Windows analytical function to select the correct records from partition


I have just 2 columns for partition and on basis of those two columns, need to select the appropriate dates.

Input Data:

strt_dt end_dt ID Amt
2023-09-29 2023-10-01 ID1 100
2023-10-02 2023-10-02 ID1 100
2023-10-03 2023-10-03 ID1 100
2023-10-04 2023-10-04 ID1 200
2023-10-05 2023-10-05 ID1 100

Expected Output:

strt_dt end_dt ID Amt
2023-09-29 2023-10-03 ID1 100
2023-10-04 2023-10-04 ID1 200
2023-10-05 2023-10-05 ID1 100

Actual Output:

strt_dt end_dt ID Amt
2023-09-29 2023-10-05 ID1 100
2023-10-04 2023-10-04 ID1 200

I have used min(strt_dt) OVER (PARTITION by id, amt ORDER BY strt_dt) which is the cause of the overlapping results.

Do I need to do anything differently?


Solution

  • It can be achieve in 2 ways, I will provide both simple and complex versions.

    ROW_NUMBER() will work for the requirement, however your need to use it twice (first without amount and then with amount). The below approach is a simple version,

    WITH consecutive_blocks AS (
        SELECT
            strt_dt,
            end_dt,
            ID,
            Amt,
            ROW_NUMBER() OVER (PARTITION BY ID ORDER BY strt_dt) AS rn1,
            ROW_NUMBER() OVER (PARTITION BY ID, Amt ORDER BY strt_dt) AS rn2
        FROM 
            table_a
    )
    SELECT 
        MIN(strt_dt) AS strt_dt,
        MAX(end_dt) AS end_dt,
        ID,
        Amt
    FROM 
        consecutive_blocks
    GROUP BY 
        ID, Amt, (rn1 - rn2)
    ORDER BY 
        strt_dt;
    
    

    There is an efficient way of achieving the same result, but it will be a more complex one,

    WITH flagged_rows AS (
        SELECT
            strt_dt,
            end_dt,
            ID,
            Amt,
            CASE 
                WHEN LAG(Amt) OVER (PARTITION BY ID ORDER BY strt_dt) = Amt THEN 0 
                ELSE 1 
            END AS flag
        FROM 
            table_a
    ),
    grouped_rows AS (
        SELECT 
            strt_dt,
            end_dt,
            ID,
            Amt,
            SUM(flag) OVER (PARTITION BY ID ORDER BY strt_dt ROWS UNBOUNDED PRECEDING) AS grp
        FROM 
            flagged_rows
    )
    SELECT 
        MIN(strt_dt) AS strt_dt,
        MAX(end_dt) AS end_dt,
        ID,
        Amt
    FROM 
        grouped_rows
    GROUP BY 
        ID, Amt, grp
    ORDER BY 
        strt_dt;