sqlsql-servertriggerslagcumulative-frequency

Resetting Cumulative Figures when a new set of data appears


I have this table (minus the cuml column):


¦  Name  ¦¦  website  ¦¦   page  ¦¦fruit type¦¦year week¦¦platform¦¦totalviews¦¦cuml¦
¦avocado ¦¦avocado.com¦¦aboutpage¦¦  sugar   ¦¦ 2001-08 ¦¦ mobile ¦¦     18   ¦¦ 18 ¦
¦avocado ¦¦avocado.com¦¦homepage ¦¦  sugar   ¦¦ 2001-08 ¦¦ desktop¦¦     10   ¦¦ 10 ¦
¦avocado ¦¦avocado.com¦¦homepage ¦¦  sugar   ¦¦ 2001-09 ¦¦ desktop¦¦     12   ¦¦ 22 ¦
¦avocado ¦¦avocado.com¦¦homepage ¦¦  sugar   ¦¦ 2001-10 ¦¦ desktop¦¦     6    ¦¦ 28 ¦
¦banana  ¦¦banana.com ¦¦aboutpage¦¦  fat     ¦¦ 2001-08 ¦¦tablet  ¦¦     21   ¦¦ 21 ¦
¦banana  ¦¦banana.com ¦¦contactus¦¦  fat     ¦¦ 2001-08 ¦¦tablet  ¦¦     14   ¦¦ 14 ¦
¦banana  ¦¦banana.com ¦¦homepage ¦¦  fat     ¦¦ 2001-08 ¦¦desktop ¦¦     15   ¦¦ 15 ¦
¦oranges ¦¦oranges.com¦¦aboutpage¦¦  sugar   ¦¦ 2001-09 ¦¦tablet  ¦¦     23   ¦¦ 23 ¦
¦oranges ¦¦oranges.com¦¦aboutpage¦¦  sugar   ¦¦ 2001-10 ¦¦tablet  ¦¦     15   ¦¦ 38 ¦
¦oranges ¦¦oranges.com¦¦contactus¦¦  sugar   ¦¦ 2001-08 ¦¦desktop ¦¦     6    ¦¦ 6  ¦

What I want to do is return the same table but with cuml column at the end this time. I have tried this…


SELECT 

  [NAME]
, [WEBSITE]
, [PAGE]
, [FRUIT TYPE]
, [YEAR WEEK]
, [PLATFORM]
, [TOTALVIEWS]
, SUM(TOTALVIEWS) OVER(ORDER BY [REPORTING ISO YEAR WEEK] ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS CUML

FROM WEBVIEWSFORFRUITS

------if i place this WHERE clause in the statement the cuml column works...
--where [NAME] = 'AVACADO' and [PLATFORM] = 'DESKTOP'
------but i would like to this without the where clause...

ORDER BY [NAME], [WEBSITE], [PAGE], 
[FRUIT TYPE], [PLATFORM], [REPORTING ISO YEAR WEEK]

Which is good however as you can see from my table the other columns pose a bigger challenge. How do I work out the cumulative frequency for every row that has the same name, website, page, fruit type, platform with the only difference is the increment in year-week, when it hits a different set of names, websites, page, etc etc i would need the cuml column to reset to whatever the total views is for that change and carry on calculating the cuml until it hits a new set of data etc and needs to reset. so in this example row 2 (avocado.com) has cuml. up to 28 (row 4), then resets to totalviews of 21 when new data/weeks occurs and keeps resetting until it hits row 8 and 9 where it starts adding the previous row into the cuml (23+15 = 38)...it then resets to 6 as its new data etc.

I'm not entirely sure what I can do to solve this.

I'm thinking of a lag function? mixed with some sort of trigger statement?


Solution

  • I think using Partition By inside Over clause will create required output -

    declare @xyz table (
        Name varchar(50),
        website  varchar(50),
        page  varchar(50),
        fruittype varchar(50),
        yearweek varchar(50),
        platform varchar(50),
        totalviews int
    )
    
    insert into @xyz
    select 'avocado' ,'avocado.com','aboutpage',  'sugar'   , '2001-08' ,'mobile'  ,18 union all
    select 'avocado' ,'avocado.com','homepage' ,  'sugar'   , '2001-08' ,'desktop' ,10 union all
    select 'avocado' ,'avocado.com','homepage' ,  'sugar'   , '2001-09' ,'desktop' ,12 union all
    select 'avocado' ,'avocado.com','homepage' ,  'sugar'   , '2001-10' ,'desktop' ,6  union all
    select 'banana'  ,'banana.com' ,'aboutpage',  'fat'     , '2001-08' ,'tablet'  ,21 union all
    select 'banana'  ,'banana.com' ,'contactus',  'fat'     , '2001-08' ,'tablet'  ,14 union all
    select 'banana'  ,'banana.com' ,'homepage' ,  'fat'     , '2001-08' ,'desktop' ,15 union all
    select 'oranges' ,'oranges.com','aboutpage',  'sugar'   , '2001-09' ,'tablet'  ,23 union all
    select 'oranges' ,'oranges.com','aboutpage',  'sugar'   , '2001-10' ,'tablet'  ,15 union all
    select 'oranges' ,'oranges.com','contactus',  'sugar'   , '2001-08' ,'desktop' ,6
    
    select *,
        sum(totalviews) over (partition by name, website, page, fruittype, platform order by yearweek rows between unbounded preceding and current row)
    from @xyz