Say I have a data frame like myiris below, in which I want to highlight only the setosa species.
However, I don't want the other species to show in the legend. For convenience, I set all the other species to NA in a new Highlight column.
I do the following:
data(iris)
library(ggplot2)
myiris <- data.frame(iris$Sepal.Length, iris$Petal.Length, Highlight=as.character(iris$Species))
names(myiris)[1:2] <- c('Sepal.Length', 'Petal.Length')
myiris$Highlight[myiris$Highlight!="setosa"] <- NA
myiris$Highlight <- factor(myiris$Highlight, levels="setosa")
plot_palette <- c("red","gray70")
P <- ggplot(myiris, aes(x=Sepal.Length, y=Petal.Length, color=Highlight)) +
  geom_point(pch=16, size=5, alpha=0.5) +
  scale_color_manual(values=plot_palette, breaks='setosa')
P
This produces the following plot, which is exactly what I expect.
However, I would also like the point shape to depend on Highlight, with setosa points filled and NA points hollow.
I use scale_shape_manual in exactly the same way I just used scale_color_manual:
P <- ggplot(myiris, aes(x=Sepal.Length, y=Petal.Length, color=Highlight, shape=Highlight)) +
  geom_point(size=5, alpha=0.5) +
  scale_color_manual(values=plot_palette, breaks='setosa') +
  scale_shape_manual(values=c(16,1), breaks='setosa')
However, I get:
Warning message: Removed 100 rows containing missing values or values outside the scale range (geom_point()).
And the plot produced is this:
Why does scale_shape_manual behave differently from its counterpart functions, and how can I correct this to obtain what I need (color and shape as a function of Highlight, with no NA group in the legend)?
EDIT
NAs in the Highlight column are indeed not the problem. You could try to accomplish the same using the original iris (instead of myiris) and the Species column (instead of Highlight), but the same problem occurs:
P <- ggplot(iris, aes(x=Sepal.Length, y=Petal.Length, color=Species)) +
  geom_point(pch=16, size=5, alpha=0.5) +
  scale_color_manual(values=c('red','gray70','gray70'), breaks='setosa')
and
P <- ggplot(iris, aes(x=Sepal.Length, y=Petal.Length, color=Species, shape=Species)) +
  geom_point(size=5, alpha=0.5) +
  scale_color_manual(values=c('red','gray70','gray70'), breaks='setosa') +
  scale_shape_manual(values=c(16,1,1), breaks='setosa')
Basically, scale_color_manual and scale_shape_manual work the same way: in both cases ggplot2 assigns na.value to the categories excluded from breaks (because values is an unnamed vector). For scale_color_manual the default na.value is "grey50" (the difference from "grey70" is hardly visible, but you can see it using layer_data()), whereas for scale_shape_manual it is NA. Points whose shape is NA cannot be drawn, so geom_point() drops them, which is where the warning about 100 removed rows comes from.
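To check this yourself, here is a small sketch that builds the second plot from the question and inspects the values the scales actually assigned. The object names p and d are only illustrative, and the exact default color string may differ between ggplot2 versions.

```r
library(ggplot2)

# Same plot as in the question: unnamed values plus breaks = "setosa"
p <- ggplot(iris, aes(x = Sepal.Length, y = Petal.Length, color = Species, shape = Species)) +
  geom_point(size = 5, alpha = 0.5) +
  scale_color_manual(values = c("red", "gray70", "gray70"), breaks = "setosa") +
  scale_shape_manual(values = c(16, 1, 1), breaks = "setosa")

# layer_data() returns the data after the scales have been applied,
# but before geom_point() drops rows it cannot draw
d <- layer_data(p)

# Tabulate the mapped aesthetics, keeping NA as its own category
table(d$colour, useNA = "ifany")
table(d$shape, useNA = "ifany")
```

The rows for the two species left out of breaks should show the color scale's na.value and a missing shape.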
Hence, one fix for your issue would be to set na.value explicitly:
library(ggplot2)
ggplot(iris, aes(x = Sepal.Length, y = Petal.Length, color = Species, shape = Species)) +
  geom_point(size = 5, alpha = 0.5) +
  scale_color_manual(
    values = c("red", "gray70", "gray70"),
    breaks = "setosa",
    na.value = "gray70" # The default is "grey50"
  ) +
  scale_shape_manual(
    values = c(16, 1, 1),
    breaks = "setosa",
    na.value = 1
  )
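Since the issue comes from values being unnamed, another option that should work is to name the values. As far as I understand the documentation, named values are matched to the factor levels by name, so breaks then only controls what appears in the legend. A minimal sketch (species_colors, species_shapes, p_named and d_named are just illustrative names):

```r
library(ggplot2)

# Named vectors: every species gets an explicit value,
# so no category falls back to na.value
species_colors <- c(setosa = "red", versicolor = "gray70", virginica = "gray70")
species_shapes <- c(setosa = 16, versicolor = 1, virginica = 1)

p_named <- ggplot(iris, aes(x = Sepal.Length, y = Petal.Length, color = Species, shape = Species)) +
  geom_point(size = 5, alpha = 0.5) +
  scale_color_manual(values = species_colors, breaks = "setosa") +
  scale_shape_manual(values = species_shapes, breaks = "setosa")

# Inspect the mapped values: no shape should be missing now
d_named <- layer_data(p_named)
table(d_named$shape, useNA = "ifany")

p_named
```

Note that this only helps when the non-highlighted points belong to real factor levels, as in iris. With the myiris data, where those rows are NA, you would still need na.value.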