r ggplot2 scale-color-manual

ggplot2 scale_color_manual showing all values in legend


I have a report that I create and update which has many different sites for different parameters. Not all parameters are measured at every site. I created a manual color scale so that the sites show as the same colors across plots.

The trouble is that now every value in the manual color scale shows in the legend, regardless of whether it is present in the plot. I created a reproducible example below:

library(ggplot2)
library(dplyr)

set.seed(123)

df <- tibble(
  site = rep(c("front", "back", "top"), 4),
  x = rnorm(12, 0, 1),
  y = rnorm(12, 0, 2)
) %>% 
  mutate(site = as.factor(site))

my_colors <- c("red", "blue", "green")

names(my_colors) <- levels(df$site)

col_scale <- scale_colour_manual(name = "site", values = my_colors)

ggplot(df, aes(x = x, y = y, color = site)) + 
  theme_bw() + 
  geom_point() + 
  col_scale

df %>% filter(site != "top") %>% 
  ggplot(aes(x = x, y = y, color = site)) + 
  theme_bw() + 
  geom_point() + 
  col_scale

Which creates the following plots:

plot with three sites

plot with two sites

I want only the sites that are present in the plot to show in the legend. Any help would be much appreciated! Thanks!


Solution

  • This is a recent bug in ggplot2; see this issue. A temporary workaround might be to specify limits = force, or any other function that returns its input unchanged.

    library(ggplot2)
    library(dplyr)
    
    set.seed(123)
    
    df <- tibble(
      site = rep(c("front", "back", "top"), 4),
      x = rnorm(12, 0, 1),
      y = rnorm(12, 0, 2)
    ) %>% 
      mutate(site = as.factor(site))
    
    my_colors <- c("red", "blue", "green")
    
    names(my_colors) <- levels(df$site)
    
    col_scale <- scale_colour_manual(name = "site", values = my_colors,
                                     limits = force)
    
    ggplot(df, aes(x = x, y = y, color = site)) + 
      theme_bw() + 
      geom_point() + 
      col_scale
    

    df %>% filter(site != "top") %>% 
      ggplot(aes(x = x, y = y, color = site)) + 
      theme_bw() + 
      geom_point() + 
      col_scale
    

    Created on 2021-06-22 by the reprex package (v1.0.0)
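
If you would rather not depend on how a function passed to limits is handled, another option is to work out the limits from the data for each plot. The following is only a sketch under that assumption: present_limits() and site_scale() are hypothetical helper names, not part of ggplot2, and the data frame uses simple stand-in values for x and y rather than the random ones from the question.

```r
library(ggplot2)

# Same colour assignment as in the question (names follow the factor levels).
my_colors <- c(back = "red", front = "blue", top = "green")

# Hypothetical helper: names of the colours whose site occurs in `data`.
present_limits <- function(data, colors) {
  intersect(names(colors), as.character(unique(data$site)))
}

# Hypothetical helper: a manual colour scale whose legend is limited
# to the sites that are present in `data`.
site_scale <- function(data, colors = my_colors) {
  scale_colour_manual(name = "site", values = colors,
                      limits = present_limits(data, colors))
}

# Stand-in data with the same three sites as the question.
df <- data.frame(
  site = factor(rep(c("front", "back", "top"), 4)),
  x = seq_len(12),
  y = rev(seq_len(12))
)
df_sub <- subset(df, site != "top")

p <- ggplot(df_sub, aes(x = x, y = y, color = site)) +
  theme_bw() +
  geom_point() +
  site_scale(df_sub)
p
```

The trade-off is that the scale has to be built per plot from the data being plotted, instead of being defined once and reused, but each site still gets the same colour in every plot because the named vector is shared.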