I am trying to visualize my benchmarking results: for each combination of three values of the discrete hyperparameter p and two values of the categorical hyperparameter q, I have 50 runtimes. I want to draw a boxplot for each value of p on the x-axis and separate the boxes by color based on q:
library(ggplot2)
min_ex <- data.frame(p = factor(rep(c(3, 10, 20), each = 2 * 50)),
                     q = factor(rep(c("A", "B"), each = 50)),
                     time = rnorm(3 * 2 * 50))
ggplot(min_ex, aes(x = p, y = time, color = q)) +
  geom_boxplot()
AFAIK, I need to code p as a factor in order to group the boxplots by p. However, the x-axis is now discrete, and I want it to be continuous, i.e. I want the spacing between the x-values (3, 10, 20) to reflect their actual positions on a continuous scale.
Specifying a continuous scale using scale_x_continuous yields the error "Discrete values supplied to continuous scale".
This question had a similar issue that was solved by specifying further factor levels. However, this would only work in my case if the x-values were evenly spaced. I could mitigate that by giving all values between 3 and 20 as factor levels, but then each level gets its own axis tick.
How can I specify a discrete x value on a continuous x axis scale?
A pure {ggplot2} option would be to convert your p column to integers and to explicitly set the group aes to group by both p and q:
library(ggplot2)
set.seed(123)
min_ex <- data.frame(
  p = factor(rep(c(3, 10, 20), each = 2 * 50)),
  q = factor(rep(c("A", "B"), each = 50)),
  time = rnorm(3 * 2 * 50)
)
ggplot(
  min_ex,
  aes(
    x = as.integer(as.character(p)),
    y = time,
    color = q,
    group = interaction(p, q)
  )
) +
  geom_boxplot() +
  scale_x_continuous(breaks = c(3, 10, 20))
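Two details of the code above are easy to trip over, so here is a minimal sketch that spells them out. First, calling as.integer() directly on a factor returns the internal level codes (1, 2, 3) rather than the labels, which is why the conversion goes through as.character(). Second, on a continuous axis the default box width can look thin when the x-values are far apart, so the sketch sets width explicitly. The column name p_num and the value width = 2 are my own illustrative choices, not anything from the question; I also spell out the repetition of q instead of relying on data.frame() recycling.

```r
library(ggplot2)

set.seed(123)
min_ex <- data.frame(
  p    = factor(rep(c(3, 10, 20), each = 2 * 50)),
  # Written out in full rather than recycled: A/B blocks within each p
  q    = factor(rep(rep(c("A", "B"), each = 50), times = 3)),
  time = rnorm(3 * 2 * 50)
)

# as.integer() on a factor gives the level codes, not the labels
codes  <- unique(as.integer(min_ex$p))
# Going through as.character() recovers the original numbers
values <- unique(as.integer(as.character(min_ex$p)))

# p_num is a hypothetical helper column holding the real x positions
min_ex$p_num <- as.integer(as.character(min_ex$p))

plt <- ggplot(
  min_ex,
  aes(
    x = p_num,
    y = time,
    color = q,
    group = interaction(p_num, q)  # one box per (p, q) combination
  )
) +
  # width is given in x-axis units; 2 is an arbitrary choice for this data
  geom_boxplot(width = 2) +
  scale_x_continuous(breaks = sort(unique(min_ex$p_num)))

print(plt)
```

If the boxes end up too wide or too narrow for your real data, adjusting width is probably the first thing to try.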