r ggplot2 tidyverse missing-data lubridate

ggplot with shading for the missing months in R?


Update: I've added uncertainty bounds (Up and Low) to the original data.frame; they should be part of the plot.

I've asked this question before, but I'm asking again to see if there's another method for shading missing months' data without having the lines connected.

library(tidyverse)
library(lubridate)  # for month()

DF <- data.frame(Date = seq(as.Date("2000-01-01"), to = as.Date("2005-12-31"), by = "day"),
                 Obs = runif(2192, 5, 10),
                 Sim = runif(2192, 8, 12),
                 Up = runif(2192, 13, 15),
                 Low = runif(2192, 1, 4)) %>%
      filter(between(month(Date), 3, 10))  # keep March to October only

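ggplot(DF, aes(x = Date)) +
  # ymin takes the lower bound, ymax the upper bound
  geom_ribbon(aes(ymin = Low, ymax = Up), fill = "grey30", alpha = 0.5) +
  # fixed colours go outside aes(); inside aes() they are treated as data labels
  geom_line(aes(y = Obs), col = "blue") +
  geom_line(aes(y = Sim), col = "red")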

Output needed: this is roughly what I want, but without lines connecting across the missing months.


Update 2: An example output with uncertainty bounds.


Solution

  • I would probably use the data to find the start and stop times of any periods without consecutive daily measurements:

    library(tidyverse)
    
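    missing_months <- DF %>%
      # flag the last day before each gap of more than one day ...
      mutate(missing = c(diff(Date), 1) > 1,
             # ... and the first day after it
             missing = missing | lag(missing)) %>%
      filter(missing) %>%
      # the flagged rows alternate: last day before a gap, first day after it
      mutate(start_stop = rep(c("start", "stop"), length.out = n())) %>%
      select(Date, start_stop) %>%
      pivot_wider(names_from = start_stop, values_from = Date, values_fn = list) %>%
      unnest(cols = c(start, stop)) %>%
      # move both ends inward so the range covers only the missing days
      mutate(start = start + 1, stop = stop - 1)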
    
    missing_months
    #> # A tibble: 5 x 2
    #>   start      stop      
    #>   <date>     <date>    
    #> 1 2000-11-01 2001-02-28
    #> 2 2001-11-01 2002-02-28
    #> 3 2002-11-01 2003-02-28
    #> 4 2003-11-01 2004-02-29
    #> 5 2004-11-01 2005-02-28
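
    The gap detection may be easier to follow on a toy vector. This is only a minimal base R sketch of the same idea, not the pipeline above; `dates`, `gap_after` and `gaps` are illustrative names that do not appear in the question.

    ```r
    # Toy daily series with one gap: 4 to 8 January are missing
    dates <- as.Date("2000-01-01") + c(0:2, 8:10)

    # TRUE where the step to the next observation is longer than one day
    gap_after <- as.numeric(diff(dates)) > 1

    # A gap runs from the day after the last observation before it
    # to the day before the first observation after it
    gaps <- data.frame(start = dates[-length(dates)][gap_after] + 1,
                       stop  = dates[-1][gap_after] - 1)
    gaps
    ```

    Here `gaps` has a single row, running from 2000-01-04 to 2000-01-08.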
    

    To stop the lines joining up, you need to label each consecutive run and use this label as a grouping variable in your plot. Remember it's going to be easier to plot with your data pivoted into long format:

    DF %>%
      pivot_longer(c('Obs', 'Sim')) %>%
      group_by(name) %>%
      mutate(run = cumsum(c(0, diff(Date) > 1)), group = interaction(name, run)) %>%
      ggplot() +
      geom_rect(aes(xmin = start, xmax = stop, ymin = -Inf, ymax = Inf),
                data = missing_months, fill = "gray30", alpha = 0.5) +
      geom_ribbon(aes(group = group, x = Date, ymin = Low, ymax = Up), 
                  fill = 'gray80') +
      geom_line(aes(x = Date, y = value, color = name, group = group), 
                linewidth = 0.1) +
      scale_color_manual(NULL, values = c("red3", "blue3", "green4", "black"),
                         guide = guide_legend(override.aes = list(linewidth = 1))) +
      theme_bw(20)
    

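
    To see what the `run` label looks like, here is a small sketch of the `cumsum(c(0, diff(Date) > 1))` step on a toy vector. The vector `dates` is made up for illustration, and `as.numeric()` is added only to make the comparison explicit.

    ```r
    # Toy daily series with two gaps (after 3 January and after 10 January)
    dates <- as.Date("2000-01-01") + c(0:2, 8:9, 20)

    # Every step longer than one day starts a new run;
    # cumsum() turns those breaks into an increasing run id
    run <- cumsum(c(0, as.numeric(diff(dates)) > 1))
    run
    ```

    Observations that share a run id are consecutive days, so using the id (combined with the series name) as the `group` aesthetic makes `geom_line()` and `geom_ribbon()` draw each stretch separately instead of bridging the gaps.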