sqlteradatagaps-and-islandsteradatasql

Changing data structure from "Date" to "From_Date" and "To_Date" based on ID and Flag


I have IDs, their as at date (end of month format), and their corresponding flags. I only care if the flag is Y.

It is in the following format:

ID as_at_date Enabled_Flag
1 31/01/2025 Y
1 28/02/2025 Y
1 31/03/2025 Y
1 30/06/2025 Y
2 31/01/2025 Y
2 28/02/2025 Y
2 31/03/2025 Y
2 31/04/2025 Y
2 30/06/2025 Y
2 30/09/2025 Y

As you can see, ID 1's flag is enabled from Jan-March. Because it does not have entries for April and May, it is disabled in those months, but is enabled again in June. Similar for ID 2.

I want to make the data in the following format using SQL (Teradata SQL preferably but any SQL is workable). Where the to_date is "01/01/3000", it is to indicate it is the current/most recent record.

ID from_date to_date Enabled_Flag
1 31/01/2025 31/03/2025 Y
1 30/06/2025 01/01/3000 Y
2 31/01/2025 31/04/2025 Y
2 30/06/2025 30/06/2025 Y
2 30/09/2025 01/01/3000 Y

Using a MIN()/MAX() on the data doesn't work because it'll just take the MAX() date but doesn't indicate to me if at any point an ID left the 'Y' population at any time. Please help me


Solution

  • For Teradata, this can be solved using period data type and normalize. Since your as_of_date is always the end of the month, we can create a period column for each row using the first of the month through the as_of date, unless the dates aren't contiguous (which we determine using lead). In that case, we use the as_of_date as the beginning, and 3000-01-01 as the end. I've tried to explain things in sql comments. Based on your very limited sample data, this looks to work.

    select
        id,
        begin(prd) as from_date,
        end(prd) -1 as end_date,
        enabled_flag
    from (
        select normalize
        id,
        enabled_flag,
        period(
            --get a beginning date
            --if the dates are contiguous, use the first day of the month
            --if not, use the as_of_date
            case when lead (trunc(as_of_date,'MONTH')) over (partition by id order by as_of_date)is null then as_of_date else  trunc(as_of_date,'MONTH') end,
            --get an end date
            --if the dates are contiguous, use the as_of_Date + 1 day 
                --(normalize/period the start date in inclusive, the end date is exclusive)
            --if not, use 3000-01-01'   
            case when lead (trunc(as_of_date,'MONTH')) over (partition by id order by as_of_date)is null then date '3000-01-01' else as_of_date + 1 end) as prd
        from
        <your table>
    ) t
    order by 1,2