sqlsql-serverdata-partitioning

Perform determinations within a data partition


I have a dataset as below from which I would like to draw some inferences.

Id Nbr Dt Status Cont1Sta1 DateLagInDays Recurrence
1 2 2023-10-1 1
1 2 2023-11-2 0
1 2 2023-12-13 0
1 3 2023-10-1 0
1 3 2023-11-2 0
1 3 2023-12-13 1
1 9 2023-11-1 0
1 9 2023-12-1 1
1 9 2024-1-1 0

I have already created a SQL server data partition based on ID, number, and ordered by dt in ascending order.

The inferences needed are:

  1. For the chosen partition are there any instances with status = 1.
  2. If #1 is true, the date difference between the earliest of the instances where status=0 until the row has status=1.
  1. Within the partition, are there any new rows disregarding status value after there is at least one row with status=1?

Is this possible using partition basic logic? I did try using lag and lead within the partition, but it would not yield a good result.

Any suggestion to write a good concise code will be helpful.

While I tried few other things, I have below clean code that I am using so far. I would like to have the columns Cont1Sta1, DateLagInDays, Recurrence be filled per partition after transformation, in the first instance row.

declare @t table
(
    id int,
    nbr int,
    dt date,
    status smallint,
    Cont1Sta1 bit,--if the chosen partition has atleast one status=1
    DateLagInDays int,--date diff in days from earliest record within partition to when status=1
    Recurrence bit --does partition has atleast one new row after one possible row that has status=1
)
insert into @t(id,nbr,dt,status) select 1,9,'2023-11-1',0
insert into @t(id,nbr,dt,status) select 1,9,'2023-12-1',1
insert into @t(id,nbr,dt,status) select 1,9,'2024-1-1',0
insert into @t(id,nbr,dt,status) select 1,2,'2023-10-1',1
insert into @t(id,nbr,dt,status) select 1,2,'2023-11-2',0
insert into @t(id,nbr,dt,status) select 1,2,'2023-12-13',0
insert into @t(id,nbr,dt,status) select 1,3,'2023-10-1',0
insert into @t(id,nbr,dt,status) select 1,3,'2023-11-2',0
insert into @t(id,nbr,dt,status) select 1,3,'2023-12-13',1;

select 
    id, nbr, dt,status,
    rank() over (partition by id, nbr, status order by id, nbr, dt asc, status desc) rownbr
from 
    @t
order by 
    id, nbr, dt asc, status desc;

Here is the desired tabular result:

Id Nbr Dt Status Cont1Sta1 DateLagInDays Recurrence
1 2 2023-10-1 0 1 32 1
1 2 2023-11-2 1 0 0 0
1 2 2023-12-13 0 0 0 0
1 3 2023-10-1 0 1 73 0
1 3 2023-11-2 0 0 0 0
1 3 2023-12-13 1 0 0 0
1 9 2023-11-1 1 1 0 1
1 9 2023-12-1 0 0 0 0
1 9 2024-1-1 0 0 0 0

Solution

  • After some clarification, it seems you only want the first row per group. You can use row-numbering for that.

    The rest of your calculations can be done with conditional aggregation using window functions.

    WITH cte AS (
        SELECT
          t.id,
          t.nbr,
          t.dt,
          t.status,
          COUNT(CASE WHEN t.status = 1 THEN 1 END) OVER (PARTITION BY t.ID, t.nbr) AS Cont1Sta1,
          ROW_NUMBER() OVER (PARTITION BY t.ID, t.nbr ORDER BY t.dt) AS rn,
          DATEDIFF(day, t.dt, MIN(CASE WHEN t.status = 1 THEN t.dt END) OVER (PARTITION BY t.ID, t.nbr)) AS DateLagInDays,
          IIF(
              MAX(t.dt) OVER (PARTITION BY t.ID, t.nbr) >
              MIN(CASE WHEN t.status = 1 THEN t.dt END) OVER (PARTITION BY t.ID, t.nbr),
            1, 0) AS Recurrence
        FROM @t t
    )
    SELECT *
    FROM cte
    WHERE rn = 1;
    

    db<>fiddle