Tags: r, ggplot2, ggridges, ridgeline-plot

Stop geom_density_ridges from showing non-existent tail values


When I use geom_density_ridges(), the plot often ends up showing long tails of values that don't exist in the data.

Here's an example:

library(tidyverse)
library(ggridges)

data("lincoln_weather")

# Remove all negative values for "Minimum Temperature"
d <- lincoln_weather[lincoln_weather$`Min Temperature [F]`>=0,]

ggplot(d, aes(`Min Temperature [F]`, Month)) +
  geom_density_ridges(rel_min_height=.01)

[Figure: geom_ridgeline plot]

As you can see, January, February, and December all show negative temperatures, but there are no negative values in the data at all.

Of course, I can add limits to the x-axis, but that doesn't solve the problem because it just truncates the existing erroneous density.

ggplot(d, aes(`Min Temperature [F]`, Month)) +
  geom_density_ridges(rel_min_height=.01) +
  xlim(0,80)

[Figure: geom_ridgeline plot with axis limits]

Now the plot makes it look like there are zero values for January and February (there are none). It also makes it look like 0 degrees happened often in December, when in reality there was only 1 such day.
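For reference, the claims about the data can be checked directly. This is only a quick sketch; the variable names `min_temp` and `zero_days` are made up for the check and are not part of the plotting code.

```r
library(ggridges)  # provides the lincoln_weather data set

data("lincoln_weather")

# Same filter as above: keep only non-negative minimum temperatures
d <- lincoln_weather[lincoln_weather$`Min Temperature [F]` >= 0, ]

# Smallest value left after filtering; by construction it cannot be negative
min_temp <- min(d$`Min Temperature [F]`)
print(min_temp)

# Number of days per month whose minimum temperature is exactly 0
zero_days <- table(d$Month[d$`Min Temperature [F]` == 0])
print(zero_days)
```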

How can I fix this?


Solution

  • One option is to use stat_density() instead of stat_density_ridges(). There are some things that stat_density() can't do, such as drawing vertical lines or overlaying points, but on the flip side it can do some things that stat_density_ridges() can't do, such as trimming the distributions to the data ranges.

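    # Remove all negative values for "Minimum Temperature"
    d <- lincoln_weather[lincoln_weather$`Min Temperature [F]` >= 0, ]

    # stat = "density" switches to ggplot2's stat_density(), which supports trim = TRUE.
    # after_stat(density) replaces the deprecated ..density.. notation (ggplot2 >= 3.3.0).
    ggplot(d, aes(`Min Temperature [F]`, Month, group = Month, height = after_stat(density))) +
      geom_density_ridges(stat = "density", trim = TRUE)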
    


    Alternatively, you could draw a point rug, which may serve your purpose as well or better:

    ggplot(d, aes(`Min Temperature [F]`, Month)) +
      geom_density_ridges(rel_min_height = 0.01, jittered_points = TRUE,
                          position = position_points_jitter(width = 0.5, height = 0),
                          point_shape = "|", point_size = 2,
                          alpha = 0.7)
    


    Note: these two approaches cannot currently be combined; doing so would require some modifications to the stat code.
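One possible way around that limitation is to skip the stat and compute the trimmed densities yourself. This is an untested sketch, not a built-in ggridges feature. It uses base R's `density()` with `cut = 0`, which restricts the evaluation grid to the observed range of each group, then draws the result with `stat = "identity"` and adds the rug as a separate `geom_point()` layer. The helper `trimmed_density()` and the column names `temp` and `height` are made up for this illustration.

```r
library(ggplot2)
library(ggridges)

data("lincoln_weather")
d <- lincoln_weather[lincoln_weather$`Min Temperature [F]` >= 0, ]

# Hypothetical helper: kernel density for one month, evaluated only
# between the smallest and largest observed value (cut = 0)
trimmed_density <- function(g) {
  k <- density(g$`Min Temperature [F]`, cut = 0)
  data.frame(Month = g$Month[1], temp = k$x, height = k$y)
}

# One density curve per month, stacked into a single data frame
dens <- do.call(rbind, lapply(split(d, d$Month), trimmed_density))

p <- ggplot() +
  # Precomputed heights, so no density stat is involved
  geom_density_ridges(data = dens,
                      aes(temp, Month, height = height, group = Month),
                      stat = "identity", alpha = 0.7) +
  # Rug of the raw observations along each baseline (not jittered)
  geom_point(data = d, aes(`Min Temperature [F]`, Month),
             shape = "|", size = 2) +
  labs(x = "Min Temperature [F]")

p
```

Because each curve is only evaluated on the observed range, no ridge extends below 0. The kernel estimate itself is unchanged, so the area under a trimmed curve is slightly less than 1.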