r ggplot2 patchwork

patchwork makes two legends with slightly different shapes when I just want one common legend


I have this code:

library(ggplot2)
library(patchwork)

set.seed(123)
top <- data.frame(
  freq_a = runif(65, 0, 5),
  freq_b = runif(65, 0, 5),
  phenotype = rep(c("A", "B", "C", "D", "E", "F", "G", "H", "I", "J", "K", "L", 'M'), each = 5),
  gene = rep(c("X", "Y", "Z"), each = 20/3, length.out = 65)
)

bottom <- data.frame(
  freq_a = runif(70, 0, 5),
  freq_b = runif(70, 0, 5),
  phenotype = rep(c("N", "O", "B", "C", "D", "E", "F", "G", "H", "I", "J", "K", "L", 'M'), each = 5),
  gene = rep(c("X", "Y", "Z"), each = 20/3, length.out = 70)
)


p_top <- ggplot(top, aes(x = freq_a, y = freq_b, color = gene, shape = phenotype)) + geom_point()
p_top <- p_top + scale_shape_manual(values = 1:length(unique(top$phenotype)))

p_bottom <- ggplot(bottom, aes(x = freq_a, y = freq_b, color = gene, shape = phenotype)) + geom_point()
p_bottom <- p_bottom + scale_shape_manual(values = 1:length(unique(bottom$phenotype)))

combined <- p_top + p_bottom  & theme(legend.position="bottom", legend.box="vertical")
combined <- combined + plot_layout(guides = "collect") + plot_annotation(tag_levels = 'A')

This produces a combined plot with two separate shape legends, one per panel.

The problem is that the phenotypes differ slightly between the two plots, so the same phenotype ends up with a different shape in each plot. What I want is a single legend whose shapes are correct for both plots. Is there any way to do this?


Solution

  • With guides = "collect", legends are merged only when they are identical in every respect, including the set of values they cover. To get a single legend, both plots have to use the same values= and have limits= set to include every phenotype.

    To this end, I created a shape palette, i.e. a named vector, from the union of the phenotype values in both datasets.

    library(ggplot2)
    library(patchwork)
    
    # Every phenotype found in either dataset, each mapped to one shape code
    pal_shape <- union(top$phenotype, bottom$phenotype)
    pal_shape <- setNames(seq_along(pal_shape), pal_shape)
    
    p_top <- ggplot(top, aes(x = freq_a, y = freq_b, color = gene, shape = phenotype)) +
      geom_point()
    
    p_bottom <- ggplot(bottom, aes(x = freq_a, y = freq_b, color = gene, shape = phenotype)) +
      geom_point()
    
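    # `+` adds the layout and annotation to the patchwork itself;
    # `&` applies the theme and the shape scale to every panel
    p_top + p_bottom +
      plot_layout(guides = "collect") +
      plot_annotation(tag_levels = "A") &
      theme(legend.position = "bottom", legend.box = "vertical") &
      scale_shape_manual(values = pal_shape, limits = names(pal_shape))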
    

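To see why the original legends could not be merged, here is a small base R sketch of the two mappings. It assumes that ggplot2 assigns unnamed values= to the labels in sorted order within each plot (its default for character columns). The names top_pheno, bottom_pheno, shape_top and shape_bottom are made up for illustration and are not part of the question's code.

```r
# Phenotype labels as they occur in the question's two data frames
top_pheno    <- LETTERS[1:13]                # "A" ... "M"
bottom_pheno <- c("N", "O", LETTERS[2:13])   # "N", "O", "B" ... "M"

# Per-plot numbering as in the question: each plot numbers only its own
# labels (assumption: unnamed values are matched to the sorted labels)
shape_top    <- setNames(seq_along(top_pheno), sort(unique(top_pheno)))
shape_bottom <- setNames(seq_along(bottom_pheno), sort(unique(bottom_pheno)))

# The same phenotype gets different shapes in the two plots
shape_top[["B"]]     # 2
shape_bottom[["B"]]  # 1

# Shared palette as in the answer: one shape per label across both plots
pal_shape <- union(top_pheno, bottom_pheno)
pal_shape <- setNames(seq_along(pal_shape), pal_shape)

# All 15 labels are covered, so both plots can share one legend
stopifnot(all(c(top_pheno, bottom_pheno) %in% names(pal_shape)))
length(pal_shape)    # 15
```

Because pal_shape is a named vector, each phenotype keeps its shape regardless of which labels happen to be present in a given plot, and passing names(pal_shape) as limits= makes both legends list the same 15 entries.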