I have a report that I create and update which has many different sites for different parameters. Not all parameters are measured at every site. I created a manual color scale so that the sites show as the same colors across plots.
The trouble is that every value in the manual color scale now shows in the legend, regardless of whether it is present in the plot. I created a reproducible example below:
library(ggplot2)
library(dplyr)
set.seed(123)
df <- tibble(
  site = rep(c("front", "back", "top"), 4),
  x = rnorm(12, 0, 1),
  y = rnorm(12, 0, 2)
) %>%
  mutate(site = as.factor(site))
my_colors <- c("red", "blue", "green")
names(my_colors) <- levels(df$site)
col_scale <- scale_colour_manual(name = "site", values = my_colors)
ggplot(df, aes(x = x, y = y, color = site)) +
  theme_bw() +
  geom_point() +
  col_scale
df %>% filter(site != "top") %>%
  ggplot(aes(x = x, y = y, color = site)) +
  theme_bw() +
  geom_point() +
  col_scale
Which creates the following plots:
I want only the sites that are present in the plot to show in the legend. Any help would be much appreciated! Thanks!
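This is a recent bug in ggplot2; see this issue. A temporary workaround is to specify limits = force, or any other function that returns its input verbatim. This works because limits accepts a function that receives the automatically computed limits, so returning them unchanged keeps only the levels present in the data.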
library(ggplot2)
library(dplyr)
#>
#> Attaching package: 'dplyr'
#> The following objects are masked from 'package:stats':
#>
#> filter, lag
#> The following objects are masked from 'package:base':
#>
#> intersect, setdiff, setequal, union
set.seed(123)
df <- tibble(
  site = rep(c("front", "back", "top"), 4),
  x = rnorm(12, 0, 1),
  y = rnorm(12, 0, 2)
) %>%
  mutate(site = as.factor(site))
my_colors <- c("red", "blue", "green")
names(my_colors) <- levels(df$site)
col_scale <- scale_colour_manual(name = "site", values = my_colors,
                                 limits = force)
ggplot(df, aes(x = x, y = y, color = site)) +
  theme_bw() +
  geom_point() +
  col_scale
df %>% filter(site != "top") %>%
  ggplot(aes(x = x, y = y, color = site)) +
  theme_bw() +
  geom_point() +
  col_scale
Created on 2021-06-22 by the reprex package (v1.0.0)
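If you would rather not depend on the limits behaviour, another option might be to subset the named colour vector to the sites present in each plot's data before building the scale. The sketch below assumes the same df and my_colors as in the question. present_colors() is a hypothetical helper written for this example, not a ggplot2 function.

```r
library(ggplot2)
library(dplyr)

set.seed(123)
df <- tibble(
  site = rep(c("front", "back", "top"), 4),
  x = rnorm(12, 0, 1),
  y = rnorm(12, 0, 2)
) %>%
  mutate(site = as.factor(site))

# Same palette as in the question: names are the factor levels
my_colors <- c("red", "blue", "green")
names(my_colors) <- levels(df$site)

# Hypothetical helper (not part of ggplot2): keep only the palette
# entries whose names actually occur in the data being plotted.
present_colors <- function(data, palette) {
  palette[names(palette) %in% as.character(data$site)]
}

df_sub <- df %>% filter(site != "top")
sub_colors <- present_colors(df_sub, my_colors)

# Each site keeps its original colour; "top" is no longer in the scale
p <- ggplot(df_sub, aes(x = x, y = y, color = site)) +
  theme_bw() +
  geom_point() +
  scale_colour_manual(name = "site", values = sub_colors)
p
```

Because the colours are looked up by name, each site keeps the same colour across plots. The trade-off is that the scale has to be rebuilt for every subset instead of reusing a single col_scale object.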