r, dplyr, summarize

Making list of strings while summarizing with dplyr


I have a series of dataframes, each of which contains a name column and then a text column. I'd like to find duplicates in the text, and then generate a list of all the names that are associated with the duplicate. I can get as far as getting a list of the text duplicates and the number of times each duplicate occurs, but I'm struggling to find a way to get the list of associated names. Here is a reproducible example:

library(dplyr)

#two separate data frames with name/string
books1 <- data.frame(
  name=rep("Ellie", 4),
  book= c("Anne of Green Gables", "The Secret Garden", "Alice in Wonderland", "A Little Princess"))

books2 <- data.frame(
  name=rep('Jess', 6),
  book=c("Harry Potter", "Percy Jackson", "Anne of Green Gables", "Chronicles of Narnia", "Redwall", "A Little Princess"))

#combine into single data frame
books <- bind_rows(books1, books2)

#identify repeats
repeatbooks <- books %>% group_by(book) %>% summarize(n=n())

This gives me:

  book                     n
1 A Little Princess        2
2 Alice in Wonderland      1
3 Anne of Green Gables     2
4 Chronicles of Narnia     1
5 Harry Potter             1
6 Percy Jackson            1
7 Redwall                  1
8 The Secret Garden        1

What I'd like is something like:

  book                     n     name
1 A Little Princess        2     Ellie, Jess
2 Alice in Wonderland      1     Ellie
3 Anne of Green Gables     2     Ellie, Jess

I'd hoped to do something like this, but it creates multiple rows rather than grouping the names into a single row:

#identify repeats while catching associated names - doesn't collapse the names into a single row
repeatbooks <- books %>% group_by(book) %>% summarize(n=n(), names=c(paste0(name), ', '))

Solution

  • Do you mean something like the following? Note that reframe() and the .by argument require dplyr 1.1.0 or later.

    books %>%
      reframe(
        n = n(),
        name = toString(unique(name)),
        .by = book
      )
    

    which gives

                      book n        name
    1 Anne of Green Gables 2 Ellie, Jess
    2    The Secret Garden 1       Ellie
    3  Alice in Wonderland 1       Ellie
    4    A Little Princess 2 Ellie, Jess
    5         Harry Potter 1        Jess
    6        Percy Jackson 1        Jess
    7 Chronicles of Narnia 1        Jess
    8              Redwall 1        Jess
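
The attempt in the question produces several rows per book because c(paste0(name), ', ') returns a vector with more than one element per group. Collapsing the names into a single string per group avoids that. Below is a minimal sketch, assuming only the duplicated books are wanted; it uses group_by() and summarise(), so it should also work on dplyr versions that lack reframe() and .by. The filter(n > 1) step is an assumption about the desired result and can be dropped to keep every book.

```r
library(dplyr)

# Same example data as in the question
books <- bind_rows(
  data.frame(
    name = rep("Ellie", 4),
    book = c("Anne of Green Gables", "The Secret Garden",
             "Alice in Wonderland", "A Little Princess")
  ),
  data.frame(
    name = rep("Jess", 6),
    book = c("Harry Potter", "Percy Jackson", "Anne of Green Gables",
             "Chronicles of Narnia", "Redwall", "A Little Princess")
  )
)

# paste(..., collapse = ", ") returns one string per group,
# so summarise() yields exactly one row per book
repeatbooks <- books %>%
  group_by(book) %>%
  summarise(
    n = n(),
    name = paste(unique(name), collapse = ", ")
  ) %>%
  filter(n > 1)  # assumption: keep only books listed more than once

repeatbooks
```

toString(x) is shorthand for paste(x, collapse = ", "), so either form works here. unique() keeps a name from being repeated if the same person lists a book twice.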