r ggplot2 upsetr upsetplot

R ggplot ggupset - Create inset with combinations that have fewer intersections


I am looking for a way to subset my input data so that I can make a second UpSet plot that shows, at a finer resolution, the intersections containing fewer than 100 samples (for example). As an example, I'm using the tidyverse together with the tidy_movies data from the ggupset documentation (https://github.com/const-ae/ggupset).

I've posted a 'photoshopped' version of the figure that I need to make.

library(tidyverse)
library(ggupset)

#tidy_movies

tidy_movies %>%
  distinct(title, year, length, .keep_all=TRUE) %>%
  ggplot(aes(x=Genres)) +
  geom_bar() +
  scale_x_upset(n_intersections = 20)
  # + scale_x_continuous(limits = c(0,100)) ##This does not work when uncommented.

Ideally, I want a figure that looks like this:

[Figure: examplefigure2]

Another approach would be to figure out how to subset tidy_movies:

class(tidy_movies)
# How could I create a new version of tidy_movies that isolates a specific set of combinations?

Thoughts? Suggestions?


Solution

  • You could create a new column containing the pasted-together contents of the Genres list column, group_by it, and keep only the groups with n() < 100 (that is, drop the combinations that occur 100 times or more):

    library(tidyverse)
    library(ggupset)
    
    tidy_movies %>%
      distinct(title, year, length, .keep_all = TRUE) %>%
      mutate(gen = sapply(Genres, paste, collapse = " ")) %>%
      group_by(gen) %>%
      filter(n() < 100) %>%
      ggplot(aes(x=Genres)) +
      geom_bar() +
      scale_y_continuous(limits = c(0, 200)) +
      scale_x_upset(n_intersections = 8)
    

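If you want the subset as a reusable data frame rather than only as a plot, the same idea can be wrapped in a small function. This is a minimal sketch, not part of ggupset: the helper name keep_rare_combinations, the columns combo and combo_n, and the toy data standing in for tidy_movies are all made up for illustration. It assumes only that the data has a list column called Genres, as tidy_movies does.

```r
library(dplyr)

# Hypothetical helper (not part of ggupset): keep only the rows whose
# genre combination occurs fewer than `max_n` times in `data`.
keep_rare_combinations <- function(data, max_n) {
  data %>%
    # Build one string key per row; sorting first means c("a", "b") and
    # c("b", "a") are treated as the same combination.
    mutate(combo = vapply(Genres,
                          function(g) paste(sort(g), collapse = "&"),
                          character(1))) %>%
    # Attach the size of each combination to every row, without grouping.
    add_count(combo, name = "combo_n") %>%
    filter(combo_n < max_n)
}

# Toy stand-in for tidy_movies: one row per movie, Genres is a list column.
toy_movies <- tibble(
  title  = paste0("movie_", 1:6),
  Genres = list("Drama", "Drama", "Drama",
                c("Action", "Drama"), c("Drama", "Action"),
                "Comedy")
)

# "Drama" occurs 3 times and is dropped; the two rarer combinations remain.
rare <- keep_rare_combinations(toy_movies, max_n = 3)
```

With the real data, the result of keep_rare_combinations(., max_n = 100), applied after the distinct() step, should be usable in place of the group_by() and filter() lines above, and it can be piped into ggplot(aes(x = Genres)) in the same way. Using add_count() instead of group_by() leaves the data ungrouped, and the combo_n column makes it easy to check which threshold gives a readable inset.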