Tags: r, dplyr, duplicates, group-summaries

R - After grouping, how do I get the maximum times a value is repeated?


Say I have a dataset like this:

 id <- c(1, 1, 2, 2, 2, 2, 2, 3, 3, 3, 3)
 foo <- c('a', 'b', 'a', 'a', 'b', 'b', 'b', 'c', 'c', 'a', 'a')
 dat <- data.frame(id, foo)

I.e.,

    id  foo
 1   1   a
 2   1   b 
 3   2   a
 4   2   a
 5   2   b
 6   2   b
 7   2   b
 8   3   c
 9   3   c
10   3   a
11   3   a

For each id, how would I get the maximum number of times any single value of foo is repeated?

I.e.,

   id  max_repeat
1   1   1
2   2   3
3   3   2

For example, id 2 has a max_repeat of 3 because one of its values of foo (b) is repeated 3 times.


Solution

  • Using tidyverse:

    library(dplyr)

    dat %>%
      group_by(id, foo) %>%      # group by each id/foo combination
      tally() %>%                # count the rows in each group (column n)
      group_by(id) %>%           # regroup by id only
      summarise(res = max(n))    # keep the largest count per id
    
         id   res
      <dbl> <dbl>
    1    1.    1.
    2    2.    3.
    3    3.    2.
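
  • For comparison, a base R sketch of the same idea (not part of the original answer). It assumes the `dat` data frame from the question; the names `counts`, `max_repeat`, and `result` are only illustrative. `table()` cross-tabulates id against foo, and the row-wise maximum gives the largest repetition count per id:

```r
# Recreate the example data from the question
id <- c(1, 1, 2, 2, 2, 2, 2, 3, 3, 3, 3)
foo <- c('a', 'b', 'a', 'a', 'b', 'b', 'b', 'c', 'c', 'a', 'a')
dat <- data.frame(id, foo)

# Cross-tabulate: one row per id, one column per value of foo
counts <- table(dat$id, dat$foo)

# Largest count in each row, i.e. the most-repeated foo value per id
max_repeat <- apply(counts, 1, max)

# Assemble a data frame shaped like the desired output
result <- data.frame(id = as.numeric(names(max_repeat)),
                     max_repeat = as.vector(max_repeat))
print(result)
```

    This avoids the dplyr dependency, at the cost of building the full id-by-foo table, which can get large when both columns have many distinct values.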