r, ggplot2, venn-diagram, upsetr

How to remove the set size plot and add percentages to the numbers on top of the bars


I am trying to move away from Venn plots for anything with more than 3 conditions/samples in my experiments. They just become more and more difficult to interpret.

UpSet plots made with UpSetR::upset() work great, but I have to remove the Set Size plot in post-production by cropping. I would like to remove it from the UpSet bar plots so that the generated plot is ready to use in reports.

Also, several of my colleagues (biologists who need these figures for their reports) are used to looking at Venn diagrams, and they would really like to see percentages along with the number of observations in each intersection. Generally, I want to show the overlap in the expression of various genes etc. across different conditions.

Here is an example:

require(tidyverse)
require(stringr)
require(VennDiagram)
require(UpSetR)


temp_movies  <- ggplot2movies::movies



to_UpSet <- list( Action = filter( temp_movies, Action == 1)$title,
                  Comedy = filter( temp_movies, Comedy == 1)$title,
                  Drama = filter( temp_movies, Drama == 1)$title,
                  Romance = filter( temp_movies, Romance == 1)$title)


upset(fromList(to_UpSet), 
      order.by = "freq",
      set_size.show = TRUE)

It produces:

[Image: UpSet plot produced by the code above, including the Set Size bar chart]

How can I remove the Set Size plot and add percentages, as in the attached Venn diagram?

[Image: Venn diagram]

A manually cropped example is below, with % labels on two of the bars; I need % labels on all the bars:

[Image: manually cropped UpSet plot with % labels on two bars]


Solution

  • The UpSetR:::print.upset function (which does the final plotting) does not allow for such modifications.

    The good news, however, is that the whole plot is grid-based, so you can draw it yourself rather easily from the computed parts, skipping the set size plot and modifying the labels (which boils down to finding the correct grobs):

    library(grid)
    library(gridExtra)
    
    ups <- upset(fromList(to_UpSet), 
                 order.by = "freq",
                 set_size.show = TRUE)
    
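    skip_set_size_plot <- function(ups) {
      ## the intersection size bar chart (a gtable)
      main <- ups$Main_bar
      ## find the panel grob that holds the bars and their labels
      panel_idx <- grep("panel", main$layout$name, fixed = TRUE)
      ## find the text grob with the count labels
      ## (the \(x) lambda syntax needs R >= 4.1)
      text_idx <- which(
        vapply(main$grobs[[panel_idx]]$children, 
               \(x) inherits(x, "text"), 
               logical(1)))
      tG <- main$grobs[[panel_idx]]$children[[text_idx]]
      ## append each count's share of the sum of all displayed counts
      tG$label <- paste0(tG$label, " (",
                         scales::label_percent(0.1)(as.numeric(tG$label) / 
                                                      sum(as.numeric(tG$label))),
                         ")")
      main$grobs[[panel_idx]]$children[[text_idx]] <- tG
      ## draw only the bar chart and the matrix, leaving out the set size plot
      grid.newpage()
      grid.draw(arrangeGrob(main, ups$Matrix, heights = ups$mb.ratio))
    }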
    
    skip_set_size_plot(ups)
    

    [Image: Visualization of set intersections using the UpSet matrix design, without the bar chart representing the set sizes]

    As always with manual grid fiddling, you rely on the package's internal structure; if the package author decides to change it for whatever reason, your code may break.

    Furthermore, this code works with the simplest form of the intersection plot, but as soon as you start asking for additional elements (a legend, for example), you will need to adapt it.
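
Since the approach depends on finding the right grobs, it may help to fail loudly when the expected structure is not there. The following is only a minimal sketch on a toy gTree built with grid, not on a real UpSetR object. The helper names add_percent_labels and relabel_text_child are hypothetical and not part of UpSetR, and sprintf() stands in for scales::label_percent() so that the example needs only base R and grid:

```r
library(grid)

# Hypothetical helper (not part of UpSetR): append "(x%)" to numeric labels.
# By default the denominator is the sum of all labels passed in.
add_percent_labels <- function(labels, total = NULL, digits = 1) {
  counts <- suppressWarnings(as.numeric(labels))
  if (anyNA(counts)) stop("labels must all be numeric")
  if (is.null(total)) total <- sum(counts)
  pct <- sprintf(paste0("%.", digits, "f%%"), 100 * counts / total)
  paste0(labels, " (", pct, ")")
}

# Hypothetical helper: relabel the single text grob among the children of a
# gTree, stopping with an error if the expected structure is not found.
relabel_text_child <- function(tree, total = NULL) {
  is_text <- vapply(tree$children, function(x) inherits(x, "text"), logical(1))
  if (sum(is_text) != 1L) {
    stop("expected exactly one text grob, found ", sum(is_text))
  }
  idx <- which(is_text)
  tree$children[[idx]]$label <-
    add_percent_labels(tree$children[[idx]]$label, total)
  tree
}

# Toy stand-in for the bar panel: a rectangle plus the count labels
toy <- gTree(children = gList(
  rectGrob(name = "bars"),
  textGrob(label = c("50", "30", "20"), name = "counts")
))

toy <- relabel_text_child(toy)
getGrob(toy, "counts")$label
# "50 (50.0%)" "30 (30.0%)" "20 (20.0%)"
```

Passing total explicitly (for example, the number of unique elements across all sets) changes the denominator, which matters if only some of the intersections are displayed.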