I'm creating some box plots with geom_boxplot and geom_jitter in ggplot2. Most of my data points are clustered around the boxes, but a few are not, and I don't want to remove them as outliers. When the plot is rendered, the y axis is scaled linearly to include those points at the top, which squashes the boxes. What I'd like to do is still show the points, but have the y-axis distance between 1 and 3 be approximately the same as between 0 and 1. If the results were larger, I would log or square-root transform them, but they're small numbers. Is there a way I can make this plot?
Here's some code:
library(ggplot2)
dat <- data.frame(cat = "A", result = rnorm(87, 0.26, 0.19))
ggplot(dat, aes(x = cat, y = result)) +
  geom_boxplot() +
  geom_jitter()
Which produces
Now add in some data points further away
new_values <- data.frame(cat = "A", result = c(3.4, 3.2))
dat <- rbind(dat, new_values)
ggplot(dat, aes(x = cat, y = result)) +
  geom_boxplot() +
  geom_jitter()
which produces
What I'd like to do is adjust the scale of the y axis so that the box plot isn't compressed but it still shows the other two data points. Something like this.
Any suggestions welcome. Thanks in advance
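In general, you can apply any transformation to a scale via the trans= argument (renamed to transform= in ggplot2 3.5.0). When you have specific needs and it's worth the effort, you can create a custom transformation. As a first step, however, you might consider one of the built-in transformations. For example, scales::transform_modulus (a generalization of the Box-Cox transformation that also handles zero and negative values) seems to come close to what you have in mind. With a power of -1 it maps y to sign(y) * |y| / (|y| + 1), so large values are compressed much more than small ones: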
library(ggplot2)
library(scales)
set.seed(123)
dat <- data.frame(cat = "A", result = rnorm(87, 0.26, 0.19))
new_values <- data.frame(cat = "A", result = c(3.4, 3.2))
dat <- rbind(dat, new_values)
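ggplot(dat, aes(x = cat, y = result)) +
  # hide the boxplot's own outlier points; geom_jitter() already draws every point
  geom_boxplot(outliers = FALSE) +
  geom_jitter() +
  scale_y_continuous(
    trans = scales::transform_modulus(-1),
    # breaks set by hand, since the transformed axis is no longer evenly spaced
    breaks = c(0, .5, 1.75, 3.5)
  )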
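If the built-in transformation doesn't give quite the spacing you want, the custom-transformation route mentioned above could look roughly like the sketch below. It assumes scales >= 1.3.0 (for new_transform()). The names squash_fwd, squash_inv and transform_squash, the knot at 1, and the shrink factor of 2 are all hypothetical choices made for illustration, picked so that the distance from 1 to 3 takes up the same height as the distance from 0 to 1.

```r
library(ggplot2)
library(scales)

# Hypothetical piecewise-linear mapping (an illustrative assumption):
# values up to `knot` are left unchanged, values above it are
# compressed by a factor of `shrink`.
squash_fwd <- function(y, knot = 1, shrink = 2) {
  ifelse(y <= knot, y, knot + (y - knot) / shrink)
}

# Inverse mapping: stretches the compressed part back out.
squash_inv <- function(y, knot = 1, shrink = 2) {
  ifelse(y <= knot, y, knot + (y - knot) * shrink)
}

# Wrap the pair of functions as a transformation object for ggplot2 scales.
transform_squash <- scales::new_transform(
  "squash",
  transform = squash_fwd,
  inverse = squash_inv
)

set.seed(123)
dat <- data.frame(cat = "A", result = c(rnorm(87, 0.26, 0.19), 3.4, 3.2))

ggplot(dat, aes(x = cat, y = result)) +
  geom_boxplot(outliers = FALSE) +
  geom_jitter() +
  scale_y_continuous(
    trans = transform_squash,  # ggplot2 >= 3.5.0 also accepts transform =
    breaks = c(0, 0.5, 1, 2, 3)
  )
```

Because scale transformations are applied before the box statistics are computed, a mapping that leaves the bulk of the data untouched (here, everything below the knot) should keep the box itself the same as in the untransformed plot, while only the far-away points are pulled in.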