sqlsql-serverazure-synapse

How to get continous contract periods for customer


I am trying to get all continuous period of customers from their contract start and end dates in Synapse SQL allowing up to 30 day break between contracts.

I have been trying with LAG and Rownumber but each fails because when sorted with startdate overlapping contracts give poor results

WITH RankedContracts AS (
    SELECT 
        CustomerID,
        ContractID,
        StartDate,
        EndDate,
        -- Ensure correct order for computing PrevEndDate
        LAG(EndDate) OVER (PARTITION BY CustomerID ORDER BY StartDate, ContractID) AS PrevEndDate
    FROM Contracts
),
PeriodBreaks AS (
    SELECT 
        CustomerID,
        ContractID,
        StartDate,
        EndDate,
        -- Check if the current StartDate is NOT contiguous with the previous EndDate
        CASE 
            WHEN DATEDIFF(DAY, PrevEndDate, StartDate) > 30 THEN 1 
            ELSE 0 
        END AS IsNewPeriod
    FROM RankedContracts
)

Row 3 is problematic. It should not be a new period because it is within 30 days of row 1. Is there a way to compare to all previous rows instead of just the previous row?

Result with given code:

CustomerID ContractID StartDate EndDate IsNewPeriod
1 1 2019-01-01T00:00:00.0 2020-06-30T00:00:00.0 0
1 4 2019-04-23T00:00:00.0 2019-11-29T00:00:00.0 0
1 41 2020-07-01T00:00:00.0 2023-06-30T00:00:00.0 1
1 42 2020-12-08T00:00:00.0 2021-03-18T00:00:00.0 0
2 22 2020-07-01T00:00:00.0 2023-06-30T00:00:00.0 0
2 23 2024-01-08T00:00:00.0 2024-03-18T00:00:00.0 1
2 24 2024-04-01T00:00:00.0 2025-06-30T00:00:00.0 0

Wanted result should be like

CustomerID ContractID StartDate EndDate IsNewPeriod
1 1 2019-01-01T00:00:00.0 2020-06-30T00:00:00.0 0
1 4 2019-04-23T00:00:00.0 2019-11-29T00:00:00.0 0
1 41 2020-07-01T00:00:00.0 2023-06-30T00:00:00.0 0
1 42 2020-12-08T00:00:00.0 2021-03-18T00:00:00.0 0
2 22 2020-07-01T00:00:00.0 2023-06-30T00:00:00.0 0
2 23 2024-01-08T00:00:00.0 2024-03-18T00:00:00.0 1
2 24 2024-04-01T00:00:00.0 2025-06-30T00:00:00.0 0

Solution

  • I believe, that using the extended functionality of t-sql window functions will solve your problem. You can extend the viewed window by bounded ranges like here:

    WITH RankedContracts AS (
        SELECT 
            CustomerID,
            ContractID,
            StartDate,
            EndDate,
            -- Get the most recent EndDate that is within 30 days prior to the current StartDate
            MAX(EndDate) OVER (
                PARTITION BY CustomerID 
                ORDER BY StartDate, ContractID
                ROWS BETWEEN UNBOUNDED PRECEDING AND 1 PRECEDING
            ) AS MaxPrevEndDate
        FROM Contracts
    ),
    PeriodBreaks AS (
        SELECT 
            CustomerID,
            ContractID,
            StartDate,
            EndDate,
            -- Check if the current StartDate is more than 30 days after the MaxPrevEndDate
            CASE 
                WHEN DATEDIFF(DAY, MaxPrevEndDate, StartDate) > 30 THEN 1 
                ELSE 0 
            END AS IsNewPeriod
        FROM RankedContracts
    )
    -- Calculate the final result
    SELECT 
        CustomerID,
        ContractID,
        StartDate,
        EndDate,
        SUM(IsNewPeriod) OVER (PARTITION BY CustomerID ORDER BY StartDate, ContractID) AS PeriodGroup
    FROM PeriodBreaks
    ORDER BY CustomerID, StartDate;
    
    

    This should retrieve the desired result