rdataframegroup-bycumulative-sum

calculate cumulative sum over month and by group as well


I first calculated monthly sum by counting IDs within a group using group-by and mutate. However, as I try to use cumsum to calculate monthly cumulative sum by group using cumsum, the data just does not look right. It would be really helpful if someone can help me. Thanks~

Assuming original data frame looks like so:

person id company yyyy month
1 A 2011 January
2 A 2011 January
3 A 2011 Feburary
4 A 2011 Feburary
5 A 2011 Feburary
6 A 2011 March
7 B 2011 January
8 B 2011 January
9 B 2011 January
10 B 2011 Feburary
11 B 2011 Feburary
12 B 2011 Feburary
13 B 2011 Feburary
14 B 2011 March
15 B 2011 March
16 B 2011 April

I would like to have a final output data frame like the one below.

person id company yyyy month monthly sum cumulative monthly sum
1 A 2011 January 2 2
2 A 2011 January 2 2
3 A 2011 Feburary 3 5
4 A 2011 Feburary 3 5
5 A 2011 Feburary 3 5
6 A 2011 March 1 6
7 B 2011 January 3 3
8 B 2011 January 3 3
9 B 2011 January 3 3
10 B 2011 Feburary 4 7
11 B 2011 Feburary 4 7
12 B 2011 Feburary 4 7
13 B 2011 Feburary 4 7
14 B 2011 March 2 9
15 B 2011 March 2 9
16 B 2011 April 1 10

Solution

  • Only summing the first group value

    library(dplyr)
    
    df %>% 
      mutate(monthlySum = n_distinct(person.id), 
             cumulativeSum = row_number() == 1, .by = c(company, yyyy, month)) %>% 
      mutate(cumulativeSum = cumsum(if_else(cumulativeSum, monthlySum, 0)), 
        .by =c(company, yyyy))
       person.id company yyyy    month monthlySum cumulativeSum
    1          1       A 2011  January          2             2
    2          2       A 2011  January          2             2
    3          3       A 2011 Feburary          3             5
    4          4       A 2011 Feburary          3             5
    5          5       A 2011 Feburary          3             5
    6          6       A 2011    March          1             6
    7          7       B 2011  January          3             3
    8          8       B 2011  January          3             3
    9          9       B 2011  January          3             3
    10        10       B 2011 Feburary          4             7
    11        11       B 2011 Feburary          4             7
    12        12       B 2011 Feburary          4             7
    13        13       B 2011 Feburary          4             7
    14        14       B 2011    March          2             9
    15        15       B 2011    March          2             9
    16        16       B 2011    April          1            10