r ggplot2 geom-ribbon

Plot geom_ribbon with gaps and coloured groups?


(EDIT: reprex added)

I'm trying to plot both geom_line and geom_ribbon on data that contains NA gaps, with both layers coloured by a grouping variable grp (col=grp for the line, fill=grp for the ribbon). I'd like both the line and the ribbon to leave gaps at the NAs rather than interpolating across them.

Example data:

> library(tidyverse)
> dat <- tibble(datevar=c(1:10), y_var=c(1:10), y_min=c(1:10)*0.8, y_max=c(1:10)*1.2, grp=c(rep("A", 5), rep("B", 5)))
> dat[3,2:5] <- NA
> dat[8,2:5] <- NA

> dat
# A tibble: 10 × 5
   datevar y_var y_min y_max grp  
     <int> <int> <dbl> <dbl> <chr>
 1       1     1   0.8   1.2 A    
 2       2     2   1.6   2.4 A    
 3       3    NA  NA    NA   NA   
 4       4     4   3.2   4.8 A    
 5       5     5   4     6   A    
 6       6     6   4.8   7.2 B    
 7       7     7   5.6   8.4 B    
 8       8    NA  NA    NA   NA   
 9       9     9   7.2  10.8 B    
10      10    10   8    12   B    

For the geom_line I found this answer, which works great: I just need to add group=1, e.g. geom_line(aes(x=datevar, y=y_var, col=grp, group=1)).

p <- ggplot(dat) + geom_line(aes(x=datevar, y=y_var, col=grp, group=1))
p


With group=1 the line is no longer interpolated across the NA rows. But if I add a ribbon, the ribbon still is:

p + geom_ribbon(aes(x=datevar, ymin=y_min, ymax=y_max, fill=grp), alpha=0.3)


Is there an equivalent solution for geom_ribbon? When I use just group=1 without fill=grp, the ribbon correctly leaves the gaps, but it is no longer coloured by group:

p + geom_ribbon(aes(x=datevar, ymin=y_min, ymax=y_max, group=1), alpha=0.3)


When I try to include both, e.g. geom_ribbon(aes(x=datevar, ymin=y_min, ymax=y_max, fill=grp, group=1), alpha=0.3), it gives me the following error:

Error in `geom_ribbon()`:
! Problem while converting geom to grob.
ℹ Error occurred in the 2nd layer.
Caused by error in `draw_group()`:
! Aesthetics can not vary along a ribbon
Run `rlang::last_trace()` to see where the error occurred.
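
The message suggests that fill must be constant within each group. As a rough check (a sketch only, assuming this is the cause), counting the distinct grp values that end up in the single group created by group=1 shows more than one fill value in that group:

```r
library(dplyr)

# Same example data as above
dat <- tibble(datevar = 1:10, y_var = 1:10,
              y_min = (1:10) * 0.8, y_max = (1:10) * 1.2,
              grp = c(rep("A", 5), rep("B", 5)))
dat[3, 2:5] <- NA
dat[8, 2:5] <- NA

# group = 1 puts every row into one group, so count the distinct
# fill values (including NA) that this one group would contain
n_fill <- n_distinct(dat$grp)
n_fill  # 3: "A", "B" and NA
```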

Does anyone know a simple solution?


Solution

  • One option is to sort by your date variable and then create an ID column with consecutive_id() (available in dplyr 1.1.0 and later) applied to grp. Because the NA rows break each group into separate runs, every uninterrupted "piece" gets its own ID. You can then colour and fill by grp, but group by the ID column so that each piece is drawn separately. Note this also fixes the "extra" line that is drawn through the missing data, as pointed out by r2evans.

    library(tidyverse)
    
    dat %>%
      arrange(datevar) %>%
      mutate(id = consecutive_id(grp)) %>%
      ggplot(aes(datevar, y_var, group = id)) +
      geom_ribbon(aes(ymin = y_min, ymax = y_max, fill = grp), alpha = 0.2) +
      geom_line(aes(color = grp)) +
      scale_fill_manual(NULL, values = c("turquoise4", "orangered"), 
                        breaks = c("A", "B")) +
      scale_color_manual(NULL, values = c("turquoise4", "orangered"),
                         breaks = c("A", "B")) +
      theme_minimal(base_size = 16)
    


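    You will get some warnings about ggplot being unable to draw the empty groups. You can either ignore them or filter out the missing data by adding filter(complete.cases(.)) to the pipe. If you filter, do it after the mutate() step: the IDs have to be computed while the NA rows are still there to separate the pieces.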
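
To see why the position of the filter matters, here is a minimal sketch that compares the IDs produced by the two orderings on the example data. The names pieces and merged are only illustrative and do not appear in the answer above.

```r
library(dplyr)

# Same example data as in the question
dat <- tibble(datevar = 1:10, y_var = 1:10,
              y_min = (1:10) * 0.8, y_max = (1:10) * 1.2,
              grp = c(rep("A", 5), rep("B", 5)))
dat[3, 2:5] <- NA
dat[8, 2:5] <- NA

# IDs computed while the NA rows are still present:
# each run between gaps gets its own ID, then the NA rows are dropped
pieces <- dat %>%
  arrange(datevar) %>%
  mutate(id = consecutive_id(grp)) %>%
  filter(complete.cases(.))

# Filtering first removes the separators, so each grp collapses
# back into a single run and the gaps would be bridged again
merged <- dat %>%
  filter(complete.cases(.)) %>%
  arrange(datevar) %>%
  mutate(id = consecutive_id(grp))

pieces$id  # 1 1 3 3 4 4 6 6
merged$id  # 1 1 1 1 2 2 2 2
```

With pieces, the plot code from the answer can be used unchanged from the ggplot() call onwards, since the id column is already in place.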