I have a dataset as below from which I would like to draw some inferences.
Id | Nbr | Dt | Status | Cont1Sta1 | DateLagInDays | Recurrence |
---|---|---|---|---|---|---|
1 | 2 | 2023-10-1 | 1 | |||
1 | 2 | 2023-11-2 | 0 | |||
1 | 2 | 2023-12-13 | 0 | |||
1 | 3 | 2023-10-1 | 0 | |||
1 | 3 | 2023-11-2 | 0 | |||
1 | 3 | 2023-12-13 | 1 | |||
1 | 9 | 2023-11-1 | 0 | |||
1 | 9 | 2023-12-1 | 1 | |||
1 | 9 | 2024-1-1 | 0 |
I have already created a SQL server data partition based on ID, number
, and ordered by dt
in ascending order.
The inferences needed are:
Is this possible using partition basic logic? I did try using lag and lead within the partition, but it would not yield a good result.
Any suggestion to write a good concise code will be helpful.
While I tried few other things, I have below clean code that I am using so far. I would like to have the columns Cont1Sta1, DateLagInDays, Recurrence be filled per partition after transformation, in the first instance row.
declare @t table
(
id int,
nbr int,
dt date,
status smallint,
Cont1Sta1 bit,--if the chosen partition has atleast one status=1
DateLagInDays int,--date diff in days from earliest record within partition to when status=1
Recurrence bit --does partition has atleast one new row after one possible row that has status=1
)
insert into @t(id,nbr,dt,status) select 1,9,'2023-11-1',0
insert into @t(id,nbr,dt,status) select 1,9,'2023-12-1',1
insert into @t(id,nbr,dt,status) select 1,9,'2024-1-1',0
insert into @t(id,nbr,dt,status) select 1,2,'2023-10-1',1
insert into @t(id,nbr,dt,status) select 1,2,'2023-11-2',0
insert into @t(id,nbr,dt,status) select 1,2,'2023-12-13',0
insert into @t(id,nbr,dt,status) select 1,3,'2023-10-1',0
insert into @t(id,nbr,dt,status) select 1,3,'2023-11-2',0
insert into @t(id,nbr,dt,status) select 1,3,'2023-12-13',1;
select
id, nbr, dt,status,
rank() over (partition by id, nbr, status order by id, nbr, dt asc, status desc) rownbr
from
@t
order by
id, nbr, dt asc, status desc;
Here is the desired tabular result:
Id | Nbr | Dt | Status | Cont1Sta1 | DateLagInDays | Recurrence |
---|---|---|---|---|---|---|
1 | 2 | 2023-10-1 | 0 | 1 | 32 | 1 |
1 | 2 | 2023-11-2 | 1 | 0 | 0 | 0 |
1 | 2 | 2023-12-13 | 0 | 0 | 0 | 0 |
1 | 3 | 2023-10-1 | 0 | 1 | 73 | 0 |
1 | 3 | 2023-11-2 | 0 | 0 | 0 | 0 |
1 | 3 | 2023-12-13 | 1 | 0 | 0 | 0 |
1 | 9 | 2023-11-1 | 1 | 1 | 0 | 1 |
1 | 9 | 2023-12-1 | 0 | 0 | 0 | 0 |
1 | 9 | 2024-1-1 | 0 | 0 | 0 | 0 |
After some clarification, it seems you only want the first row per group. You can use row-numbering for that.
The rest of your calculations can be done with conditional aggregation using window functions.
WITH cte AS (
SELECT
t.id,
t.nbr,
t.dt,
t.status,
COUNT(CASE WHEN t.status = 1 THEN 1 END) OVER (PARTITION BY t.ID, t.nbr) AS Cont1Sta1,
ROW_NUMBER() OVER (PARTITION BY t.ID, t.nbr ORDER BY t.dt) AS rn,
DATEDIFF(day, t.dt, MIN(CASE WHEN t.status = 1 THEN t.dt END) OVER (PARTITION BY t.ID, t.nbr)) AS DateLagInDays,
IIF(
MAX(t.dt) OVER (PARTITION BY t.ID, t.nbr) >
MIN(CASE WHEN t.status = 1 THEN t.dt END) OVER (PARTITION BY t.ID, t.nbr),
1, 0) AS Recurrence
FROM @t t
)
SELECT *
FROM cte
WHERE rn = 1;