Tags: r, if-statement, ggplot2, fill, geom-ribbon

geom_ribbon conditional fill creating shapes that don't match data (ggplot2 r)


I am using ggplot2 to visualize model results.

I have model results in the results object, a data frame. My code to visualize it looks like the following:

results |> 
  mutate(Confidence = if_else((CI_high < 0 & CI_low < 0) | (CI_high > 0 & CI_low > 0),
                              "Significant", "Not Significant")) |> 
  ggplot(aes(x = time, y = Coefficient, ymin = CI_low, ymax = CI_high)) +
  geom_line() +
  geom_ribbon(alpha = 0.3)


This works fine, and produces the following plot:

plo1

However, I want to make the fill of the geom_ribbon conditional on significance, as the above if_else condition shows. But when I plot it using the following code:

results |> 
  mutate(Confidence = if_else((CI_high < 0 & CI_low < 0) | (CI_high > 0 & CI_low > 0),
                              "Significant", "Not Significant")) |> 
  ggplot(aes(x = time, y = Coefficient, ymin = CI_low, ymax = CI_high,
             color = Confidence, fill = Confidence)) +
  geom_line() +
  geom_ribbon(alpha = 0.3)

I get this plot: plot2

This looks wrong to me. I expected the same geom_ribbon as before, just shaded in a different color wherever the upper and lower bounds are either both above or both below 0. Instead, it plots an additional fill over the already shaded region, and the edges of the blue fill do not even match the smooth geom_ribbon from before. I have tried supplying only the fill aesthetic without color, and vice versa. I have also tried setting the aesthetics in geom_ribbon's aes() instead of the overall plot, but none of these attempts resolved the problem.

How can I fix this to conditionally fill/color the plot only within the actual boundaries of the data?

EDIT (w/ Reprex)

Here is a reproducible example that also demonstrates the issue:

library(modelbased)
library(tidyverse)

gam1 <- mgcv::gam(mpg ~ cyl +
                          s(disp), data = mtcars, method = "REML")


deriv1 <- modelbased::estimate_slopes(gam1,
  trend = "disp",
  at = "disp",
  length = 100) |> 
  ggplot(aes(x = disp, y = Coefficient, ymin = CI_low, ymax = CI_high)) +
  geom_line() +
  geom_ribbon(alpha = 0.3)

deriv2 <- modelbased::estimate_slopes(gam1,
  trend = "disp",
  at = "disp",
  length = 100) |> 
  mutate(Confidence = if_else(CI_high < 0 & CI_low < 0 | CI_high > 0 & CI_low > 0,
                              "Significant", "Not Significant")) |> 
  ggplot(aes(x = disp, y = Coefficient, ymin = CI_low, ymax = CI_high)) +
  geom_line() +
  geom_ribbon(alpha = 0.3, aes(color = Confidence, fill = Confidence)) +
  scale_color_manual(values = c("red", "grey"), breaks = c("Significant", "Not Significant")) +
  scale_fill_manual(values = c("red", "grey"), breaks = c("Significant", "Not Significant"))

deriv1
deriv2


plot3

deriv1 shows that only the very start of the geom_ribbon and a portion before the very end should be shaded red, as only these sections do not overlap with 0. However, when the shading is made conditional, the output for deriv2 is the following:

plot4

which does not match the desired output at all. The desired output should be that only the first section and a portion before the very end (where the ymin and ymax don't overlap with 0) should be shaded red. It should not be a separate ribbon from the grey ribbon.


Solution

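  • It looks odd because ggplot2 doesn't know that the significant part on the left isn't connected to the significant part on the right: with only color and fill mapped, all "Significant" rows form a single group and are drawn as one ribbon spanning both sections. You can add an explicit grouping variable, here built with dplyr's consecutive_id() (available in dplyr 1.1.0 and later), so that it knows which sections should be drawn together: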

    modelbased::estimate_slopes(gam1,
                                          trend = "disp",
                                          at = "disp",
                                          length = 100) |> 
      mutate(Confidence = if_else(CI_high < 0 & CI_low < 0 | CI_high > 0 & CI_low > 0,"Significant","Not Significant")) |> 
      mutate(Group = consecutive_id(Confidence)) |> 
      ggplot(aes(x=disp,y=Coefficient,ymin=CI_low,ymax=CI_high)) +
      geom_line() + 
      geom_ribbon(alpha=0.3,aes(color=Confidence,fill=Confidence, group=Group)) + 
      scale_color_manual(values=c("red","grey"),breaks = c("Significant","Not Significant")) + 
      scale_fill_manual(values=c("red","grey"),breaks = c("Significant","Not Significant"))
    

    There are some discontinuities when transitioning between the two groups. I guess you'd have to decide how you want to color those regions.
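
The grouping idea, and one possible way to deal with the gaps at the transitions, can be sketched on a small toy data set. Everything below is illustrative rather than part of the original answer: the data frame toy, its values, and the names bridge, bridged, n_by_conf and n_by_run are made up. The gap handling simply repeats the first row of each run as the last row of the previous run, so each ribbon is extended one step into the next section. That is only one of several reasonable choices for coloring those regions.

```r
library(dplyr)
library(ggplot2)

# Hypothetical toy data: the interval excludes zero only at both ends
toy <- data.frame(
  x       = 1:8,
  CI_low  = c(1, 0.5, -0.5, -1, -1, -0.5, 0.5, 1),
  CI_high = c(3, 2.5, 1.5, 1, 1, 1.5, 2.5, 3)
) |>
  mutate(
    Confidence = if_else(CI_high < 0 & CI_low < 0 | CI_high > 0 & CI_low > 0,
                         "Significant", "Not Significant"),
    # New id every time Confidence changes from one row to the next
    Group = consecutive_id(Confidence)
  )

toy$Group
# 1 1 2 2 2 2 3 3

# Without an explicit group, ggplot2 groups by the discrete fill only
p_by_conf <- ggplot(toy, aes(x = x, ymin = CI_low, ymax = CI_high)) +
  geom_ribbon(aes(fill = Confidence), alpha = 0.3)

# With group = Group, each consecutive run becomes its own ribbon
p_by_run <- ggplot(toy, aes(x = x, ymin = CI_low, ymax = CI_high)) +
  geom_ribbon(aes(fill = Confidence, group = Group), alpha = 0.3)

# Number of separate ribbons drawn in each case
n_by_conf <- n_distinct(layer_data(p_by_conf)$group)
n_by_run  <- n_distinct(layer_data(p_by_run)$group)

# Assumed gap handling: copy the first row of each run into the previous run
bridge <- toy |>
  mutate(prev_group = lag(Group), prev_conf = lag(Confidence)) |>
  filter(!is.na(prev_group), prev_group != Group) |>
  mutate(Group = prev_group, Confidence = prev_conf) |>
  select(-prev_group, -prev_conf)

bridged <- bind_rows(toy, bridge) |>
  arrange(Group, x)

# Adjacent ribbons now share a boundary point, so no gap is left between them
p_bridged <- ggplot(bridged, aes(x = x, ymin = CI_low, ymax = CI_high)) +
  geom_ribbon(aes(fill = Confidence, group = Group), alpha = 0.3)
```

Printing p_by_conf, p_by_run and p_bridged side by side shows the difference. The same bridging step could be applied to the estimate_slopes() output after the consecutive_id() call, before passing the data to ggplot().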