Tags: r, ggplot2, visualization, binning

Inaccurate mapping of gradient fill colours to bin counts in `geom_hex` in `ggplot2`


I am trying to plot a binned scatter plot as below using ggplot2.

library(ggplot2)
bks = seq(from = 0, to = 10000, by = 1000)
d <- ggplot(diamonds, aes(carat, price)) + theme_bw()
d + geom_point(alpha = 0.01)

image 1

When I use geom_hex, the bin counts are not mapped accurately to the fill gradient.

d + geom_hex(aes(fill = after_stat(count)), bins = 30, colour = "white") + 
  scale_fill_distiller(palette = "Spectral", breaks = bks) +
  geom_text(data = diamonds, aes(x = carat, y = price, label = after_stat(count)),
          stat="binhex", bins=30, show.legend=FALSE,
          colour="black", size=2.5)

image 2

For example, the bins with counts of 5809 and 5556 are still shown in blue.

However, with geom_bin_2d, the mapping seems to be accurate:

d + geom_bin_2d(aes(fill = after_stat(count)), bins = 30) + 
  scale_fill_distiller(palette = "Spectral", breaks = bks) +
  geom_text(data = diamonds, aes(x = carat, y = price, label = after_stat(count)),
            stat="bin_2d", bins=30, show.legend=FALSE,
            colour="black", size=2.5)

image 3

What is going wrong here? How can I get the hex bin counts mapped accurately to the fill gradient colours with geom_hex in ggplot2?


Solution

  • Update

    This bug should be fixed as of ggplot2 3.4.1, which is available on CRAN.

    Old answer

    This is a known bug that is fixed in the development version of ggplot2. You can install that version with the command below (it requires the devtools package):

    devtools::install_github("tidyverse/ggplot2")
    

    For relevant discussion, see https://github.com/tidyverse/ggplot2/issues/504
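
To see whether an installed copy of ggplot2 is already at or past the release named in the update above, a quick version check along these lines may help. This is only a sketch: the variable names `fixed_in` and `needs_update` are made up for illustration, and the 3.4.1 threshold is taken from the update above rather than verified independently.

```r
# Release that the update above says should contain the fix
# (threshold taken from that answer, not verified here).
fixed_in <- package_version("3.4.1")

# TRUE when ggplot2 is not installed or is older than that release.
needs_update <- !requireNamespace("ggplot2", quietly = TRUE) ||
  utils::packageVersion("ggplot2") < fixed_in

if (needs_update) {
  message("ggplot2 is missing or older than ", as.character(fixed_in),
          "; install.packages(\"ggplot2\") fetches the current CRAN release.")
} else {
  message("ggplot2 ", as.character(utils::packageVersion("ggplot2")),
          " is at or past ", as.character(fixed_in), ".")
}
```

If the check reports an outdated version, updating from CRAN with install.packages("ggplot2") and restarting the R session before re-running the geom_hex plot is usually enough; the development install from GitHub should then not be necessary.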