I have IDs, their as at date (end of month format), and their corresponding flags. I only care if the flag is Y.
It is in the following format:
ID | as_at_date | Enabled_Flag |
---|---|---|
1 | 31/01/2025 | Y |
1 | 28/02/2025 | Y |
1 | 31/03/2025 | Y |
1 | 30/06/2025 | Y |
2 | 31/01/2025 | Y |
2 | 28/02/2025 | Y |
2 | 31/03/2025 | Y |
2 | 31/04/2025 | Y |
2 | 30/06/2025 | Y |
2 | 30/09/2025 | Y |
As you can see, ID 1's flag is enabled from Jan-March. Because it does not have entries for April and May, it is disabled in those months, but is enabled again in June. Similar for ID 2.
I want to make the data in the following format using SQL (Teradata SQL preferably but any SQL is workable). Where the to_date is "01/01/3000", it is to indicate it is the current/most recent record.
ID | from_date | to_date | Enabled_Flag |
---|---|---|---|
1 | 31/01/2025 | 31/03/2025 | Y |
1 | 30/06/2025 | 01/01/3000 | Y |
2 | 31/01/2025 | 31/04/2025 | Y |
2 | 30/06/2025 | 30/06/2025 | Y |
2 | 30/09/2025 | 01/01/3000 | Y |
Using a MIN()/MAX() on the data doesn't work because it'll just take the MAX() date but doesn't indicate to me if at any point an ID left the 'Y' population at any time. Please help me
For Teradata, this can be solved using period data type and normalize. Since your as_of_date is always the end of the month, we can create a period column for each row using the first of the month through the as_of date, unless the dates aren't contiguous (which we determine using lead). In that case, we use the as_of_date as the beginning, and 3000-01-01 as the end. I've tried to explain things in sql comments. Based on your very limited sample data, this looks to work.
select
id,
begin(prd) as from_date,
end(prd) -1 as end_date,
enabled_flag
from (
select normalize
id,
enabled_flag,
period(
--get a beginning date
--if the dates are contiguous, use the first day of the month
--if not, use the as_of_date
case when lead (trunc(as_of_date,'MONTH')) over (partition by id order by as_of_date)is null then as_of_date else trunc(as_of_date,'MONTH') end,
--get an end date
--if the dates are contiguous, use the as_of_Date + 1 day
--(normalize/period the start date in inclusive, the end date is exclusive)
--if not, use 3000-01-01'
case when lead (trunc(as_of_date,'MONTH')) over (partition by id order by as_of_date)is null then date '3000-01-01' else as_of_date + 1 end) as prd
from
<your table>
) t
order by 1,2