r ggplot2 ggridges

Picking joint bandwidth of NaN within ggridges


I am trying to generate a ridge plot like the one described here, but the message "Picking joint bandwidth of NaN" keeps showing up. What is wrong with my code? Thanks for any pointers or tips.

toplot = structure(list(Year = c("2000", "2000", "2001", "2001", "2002", 
"2002", "2003", "2003", "2004", "2004", "2005", "2005", "2006", 
"2006", "2007", "2007", "2008", "2008", "2009", "2009", "2010", 
"2010", "2011", "2011", "2012", "2012", "2013", "2013", "2014", 
"2014", "2015", "2015", "2016", "2016", "2017", "2017", "2018", 
"2018", "2019", "2019", "2020", "2020", "2021", "2021"), genes = c("DAO", 
"IDH2", "DAO", "IDH2", "DAO", "IDH2", "DAO", "IDH2", "DAO", "IDH2", 
"DAO", "IDH2", "DAO", "IDH2", "DAO", "IDH2", "DAO", "IDH2", "DAO", 
"IDH2", "DAO", "IDH2", "DAO", "IDH2", "DAO", "IDH2", "DAO", "IDH2", 
"DAO", "IDH2", "DAO", "IDH2", "DAO", "IDH2", "DAO", "IDH2", "DAO", 
"IDH2", "DAO", "IDH2", "DAO", "IDH2", "DAO", "IDH2"), n = c(2L, 
0L, 2L, 0L, 2L, 0L, 3L, 0L, 5L, 0L, 5L, 0L, 4L, 0L, 6L, 0L, 2L, 
0L, 4L, 0L, 13L, 0L, 7L, 0L, 7L, 0L, 169L, 1L, 182L, 0L, 215L, 
56L, 147L, 11L, 165L, 115L, 10L, 62L, 13L, 74L, 14L, 59L, 67L, 
44L)), row.names = c(NA, -44L), class = c("tbl_df", "tbl", "data.frame"
))


library(tidyverse)
library(ggridges)

toplot %>%
  mutate(YearFct = fct_rev(as.factor(Year))) %>%
  ggplot(aes(y = YearFct)) +
  geom_density_ridges(
    aes(x = n, fill = paste(YearFct, genes)), 
    alpha = .8
  ) +
  labs(
    x = "No_Patent",
    y = "Year"
  ) +
  scale_y_discrete(expand = c(0, 0)) +
  scale_x_continuous(expand = c(0, 0)) +
  coord_cartesian(clip = "off") +
  theme_ridges(grid = FALSE)

Solution

  • Your code is fine; you just don't have enough data for this kind of plot. You have only a single measurement for each gene in each year, and because fill = paste(YearFct, genes) puts every year/gene pair in its own group, you are asking for a density estimate from a single point per group, which doesn't work. Automatic bandwidth selection needs at least two points in each group.
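
    To see the problem concretely, here is a minimal base-R sketch. It takes four rows from your data (toplot_small is just a name I made up for the subset), counts the observations per year/gene group, and then tries R's default bandwidth rule, bw.nrd0, on a single value. I am assuming ggridges falls back on a rule of this kind when no bandwidth is supplied, so treat it as an illustration rather than a trace of what the package does internally.

    ```r
    # Hypothetical subset of the question's data: one row per Year/genes pair.
    toplot_small <- data.frame(
      Year  = c("2013", "2013", "2014", "2014"),
      genes = c("DAO", "IDH2", "DAO", "IDH2"),
      n     = c(169L, 1L, 182L, 0L)
    )

    # Observations per group, using the grouping implied by
    # fill = paste(YearFct, genes): every group holds exactly one value.
    group_sizes <- table(paste(toplot_small$Year, toplot_small$genes))
    print(group_sizes)

    # Base R's default bandwidth rule stops with an error on a single value;
    # capture the error message instead of aborting.
    single_point <- tryCatch(
      bw.nrd0(toplot_small$n[1]),
      error = function(e) conditionMessage(e)
    )
    print(single_point)

    # With two distinct values, a bandwidth can be computed.
    two_points <- bw.nrd0(c(169, 182))
    print(two_points)
    ```

    Running the same count on your full data before plotting is a quick way to tell whether a ridge plot is feasible at all.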

    If you simulate more data (here, 10 values per gene per year, with fill mapped to genes alone), you will see your code works well enough.

    set.seed(1)
    
    toplot <- data.frame(Year = rep(2000:2021, each = 20),
                         genes = rep(c("DAO", "IDH2"), 220),
                         n = round(rexp(440, rep(c(0.9, 0.05), 220))))
    
    library(tidyverse)
    library(ggridges)
    
    toplot %>%
      mutate(YearFct = fct_rev(as.factor(Year))) %>%
      ggplot(aes(y = YearFct)) +
      geom_density_ridges(
        aes(x = n, fill = genes), 
        alpha = .8
      ) +
      labs(
        x = "No_Patent",
        y = "Year"
      ) +
      scale_y_discrete(expand = c(0, 0)) +
      scale_x_continuous(expand = c(0, 0)) +
      coord_cartesian(clip = "off") +
      theme_ridges(grid = FALSE)
    #> Picking joint bandwidth of 3.71
    

    One possible workaround is to create your own density curves using dnorm, with n as the mean and the per-gene standard deviation (computed across all years) as the sd. This gives you some idea of the uncertainty involved. It does at least produce a somewhat informative plot, though it is probably less honest than a simple dodged bar plot, which would be the obvious way to plot this data set.

    toplot %>%
      group_by(genes) %>%
      mutate(sd = sd(n)) %>%
      group_by(Year, genes) %>%
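      # Each one-row Year/genes group expands to a 1000-point curve; on
      # dplyr >= 1.1.0, reframe() is the preferred verb for multi-row results.
      summarise(x = seq(0, 300, length.out = 1000),
                dens = dnorm(x, n, sd),
                dens = dens / max(dens)) %>%  # scale each curve to a peak height of 1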
      mutate(YearFct = fct_rev(as.factor(Year))) %>%
      ggplot(aes(y = YearFct, x = x)) +
      geom_ridgeline(
        aes(height = dens, fill = genes), 
        alpha = .8
      ) +
      labs(
        x = "No_Patent",
        y = "Year"
      ) +
      scale_y_discrete(expand = c(0, 0)) +
      scale_x_continuous(expand = c(0, 0)) +
      coord_cartesian(clip = "off") +
      theme_ridges(grid = FALSE)
    


    I think this is harder to interpret, less honest, and arguably less attractive than a simple line plot.

    toplot %>%
      ggplot(aes(factor(Year), n, color = genes, group = genes)) +
      geom_line(linewidth = 1.5) +  # linewidth replaces size for lines in ggplot2 >= 3.4.0
      geom_point(size = 4, shape = 21, fill = "white") +
      scale_color_manual(values = c("deepskyblue4", "orange")) +
      labs(x = "Year") +
      theme_light(base_size = 16)
    

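
    For completeness, the dodged bar plot I mentioned could be sketched roughly like this. It uses only three years of your data to keep the example short (toplot_small is a made-up name for that subset), and the colours are simply reused from the line plot; with the full toplot you would map factor(Year) to x in the same way.

    ```r
    library(ggplot2)

    # Hypothetical subset of the question's data (2015-2017),
    # one count per gene per year.
    toplot_small <- data.frame(
      Year  = rep(c("2015", "2016", "2017"), each = 2),
      genes = rep(c("DAO", "IDH2"), times = 3),
      n     = c(215L, 56L, 147L, 11L, 165L, 115L)
    )

    # position = "dodge" places the two genes side by side within each year.
    p <- ggplot(toplot_small, aes(x = Year, y = n, fill = genes)) +
      geom_col(position = "dodge") +
      scale_fill_manual(values = c("deepskyblue4", "orange")) +
      labs(x = "Year", y = "No_Patent") +
      theme_light(base_size = 16)

    print(p)
    ```

    Each bar is exactly one observed count, so nothing is implied about a distribution that the data cannot support.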

    Created on 2022-04-24 by the reprex package (v2.0.1)