I would like to create a histogram in ggplot2 where each bin has the same number of points and all bins have the same area. In base R we could do the following:
set.seed(123)
x <- rnorm(100)
hist(x, breaks = quantile(x, 0:10 / 10))
When I try this in ggplot with scale_x_continuous, setting breaks the same way as in hist, it returns the following:
library(ggplot2)
ggplot(data = data.frame(x), aes(x = x)) +
  geom_histogram(aes(y = after_stat(density)), bins = 10) +
  scale_x_continuous(breaks = quantile(x, 0:10 / 10))
Created on 2023-01-05 with reprex v2.0.2
Why does this return a different output? Does anyone know how to create a histogram with equal-area bins using ggplot, like in the base R option above?
If you want the same output as base R's hist, you can extract the values from the object it returns and draw the rectangles yourself.
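library(ggplot2)

set.seed(123)
x <- rnorm(100)
# plot = FALSE: only compute the bins, don't draw the base R plot
hh <- hist(x, breaks = quantile(x, 0:10 / 10), plot = FALSE)
# one row per bin: left edge, right edge, and density as the bar height
data.frame(
  left = head(hh$breaks, -1),
  right = tail(hh$breaks, -1),
  height = hh$density
) |>
  ggplot() +
  aes(xmin = left, xmax = right, ymin = 0, ymax = height) +
  geom_rect(fill = "lightgray", color = "black")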
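As for why the original attempt looks different: scale_x_continuous(breaks = ...) only sets where the axis ticks are drawn. It doesn't touch the binning, so bins = 10 still gives ten equal-width bins. geom_histogram has its own breaks argument for the bin edges, so a minimal sketch that skips hist altogether could look like the following (the names edges, p and built are just mine for illustration, and the fill and color are only chosen to match the plot above):

```r
library(ggplot2)

set.seed(123)
x <- rnorm(100)

# Bin edges at the deciles of x, same as the breaks passed to hist()
edges <- quantile(x, 0:10 / 10)

p <- ggplot(data.frame(x = x), aes(x = x)) +
  # breaks = sets the bin edges; it overrides bins / binwidth
  geom_histogram(aes(y = after_stat(density)), breaks = edges,
                 fill = "lightgray", color = "black")

# Computed bins (xmin, xmax, count, density, ...) for the histogram layer
built <- layer_data(p)

p
```

If you want to check the result, built holds the computed bins: with these quantile edges each bin should hold 10 of the 100 points, and density times bin width should sum to 1 across the bins.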