sqlsql-serverrow-number

How do you get ROW_NUMBER to start over at different value between equal values in the partition?


Please see below for what I have and what I would like to have. I have tried a subquery using LEAD and LAG to create a flag for partitioning but couldn't get that to work. Thanks.

CREATE TABLE #t (PackedStr VARCHAR(100), Ident INT, BeginDate DATE, EndDate DATE);
INSERT #t
VALUES ('A,B,C,D,E', 86, '2019-03-18', '2019-03-27')
     , ('A,B,C,D,E', 87, '2019-03-28', '2019-04-09')
     , ('A,B,C,D,E,F,G', 88, '2019-04-10', '2019-04-15')
     , ('A,B,C,D,E', 89, '2019-04-16', '2019-04-24')
     , ('A,B,C,D,E', 90, '2019-04-25', '2019-05-14');

SELECT *
     , ROW_NUMBER() OVER (PARTITION BY [PackedStr] ORDER BY [Ident]) RowNumber
     , MIN( BeginDate ) OVER (PARTITION BY [PackedStr] ORDER BY [Ident]) FirstDate
     , MAX( EndDate ) OVER (PARTITION BY [PackedStr] ORDER BY [Ident] DESC) LastDate
FROM [#t]
ORDER BY [Ident];

/*
--Current results
PackedStr      Ident  BeginDate  EndDate    RowNumber  FirstDate  LastDate
-------------- ------ ---------- ---------- ---------- ---------- ----------
A,B,C,D,E      86     2019-03-18 2019-03-27 1          2019-03-18 2019-05-14
A,B,C,D,E      87     2019-03-28 2019-04-09 2          2019-03-18 2019-05-14
A,B,C,D,E,F,G  88     2019-04-10 2019-04-15 1          2019-04-10 2019-04-15
A,B,C,D,E      89     2019-04-16 2019-04-24 3          2019-03-18 2019-05-14
A,B,C,D,E      90     2019-04-25 2019-05-14 4          2019-03-18 2019-05-14

--Desired results should be:
PackedStr      Ident  BeginDate  EndDate    RowNumber  FirstDate  LastDate
-------------- ------ ---------- ---------- ---------- ---------- ----------
A,B,C,D,E      86     2019-03-18 2019-03-27 1          2019-03-18 2019-04-09
A,B,C,D,E      87     2019-03-28 2019-04-09 2          2019-03-18 2019-04-09
A,B,C,D,E,F,G  88     2019-04-10 2019-04-15 1          2019-04-10 2019-04-15
A,B,C,D,E      89     2019-04-16 2019-04-24 1          2019-04-16 2019-05-14
A,B,C,D,E      90     2019-04-25 2019-05-14 2          2019-04-16 2019-05-14
*/

Solution

  • For your initial idea - using lag(...) Solve like a regular task gaps an islands.
    Each change in PackedStr (ordered by Ident) is gap.
    Calculate number of gap as sum(isGap) - gapN.
    Calculate row_number() within gap - rn.
    See example

    select *
      ,row_number()over(partition by gapN order by Ident) rn
      ,MIN( BeginDate ) OVER (PARTITION BY gapN) FirstDate
      ,MAX( EndDate ) OVER (PARTITION BY gapN) LastDate
    from(
       select * ,sum(isGap)over(order by Ident)gapN
       from(
         select *
           ,case when PackedStr=lag(PackedStr)over(order by Ident) then 0 else 1 end as isGap
         from #t
       )a
    )b
    
    PackedStr Ident BeginDate EndDate isGap gapN rn FirstDate LastDate
    A,B,C,D,E 86 2019-03-18 2019-03-27 1 1 1 2019-03-18 2019-04-09
    A,B,C,D,E 87 2019-03-28 2019-04-09 0 1 2 2019-03-18 2019-04-09
    A,B,C,D,E,F,G 88 2019-04-10 2019-04-15 1 2 1 2019-04-10 2019-04-15
    A,B,C,D,E 89 2019-04-16 2019-04-24 1 3 1 2019-04-16 2019-05-14
    A,B,C,D,E 90 2019-04-25 2019-05-14 0 3 2 2019-04-16 2019-05-14

    fiddle