Attaching labels estimating proportion of posterior density within an interval in the ggdist package

I am trying to create a graph plotting distributions of estimated difference between several groups using the ggdist() package in R. The help has been useful but it does not cover anywhere near the breadth of functionality possible with the package. Here is an example analysis using the iris dataset in R and brms, examining the association between Species (three-level categorical) and Sepal Length (also three-level categorical, created for this analysis from a continuous variable) on Sepal Width, a numeric variable.

library(tidyverse)
library(brms)
library(ggdist)
library(tidybayes)
library(marginaleffects)
library(modelr)

#  create dataset
iris %>%
  mutate(SepalLength_cat = factor(case_when(Sepal.Length >= 4.3 & Sepal.Length < 5 ~ "cat1",
                                            Sepal.Length >= 5.1 & Sepal.Length < 6.4 ~ "cat2",
                                            TRUE ~ "cat3"))) -> irisAlt

# run model
mod <- brm(formula = Sepal.Width ~ Species*SepalLength_cat,
           data = irisAlt)

# sample from the posterior distribution at each level of the two three-level variables (so nine cells)
data_grid(data = irisAlt,
          Species = c("setosa", "versicolor", "virginica"),
          SepalLength_cat = c("cat1", "cat2", "cat3")) %>%
  add_epred_draws(mod,
                  re_formula = NA,
                  seed = 1234) -> modPred

Now using the compare_levels() function from tidybayes we compare estimated difference in Sepal Width between levels of Sepal Length at each level of Species.

# compare levels
modPred %>%
  compare_levels(variable = .epred,
                 by = SepalLength_cat) %>%
    mutate(SepalLength_cat = factor(x = SepalLength_cat,
                                    levels = c("cat2 - cat1",
                                               "cat3 - cat1",
                                               "cat3 - cat2")),
           Species = factor(Species)) -> compCat

What I want to do is first plot the distribution, shading the 95% HDI, adding a point interval below also showing the HDI and adding spikes. What I also want to include is

(a) an interval with a ROPE between -0.25 and 0.25

(b) shading the distribution falling within that ROPE a different colour.

Here is my attempt

compCat %>%
  ggplot(mapping = aes(x = .epred,
                       y = SepalLength_cat)) +
         stat_histinterval(mapping = aes(fill = after_stat(level)),
                           point_interval = median_hdi,
                           .width = c(.95)) +
         stat_spike(mapping = aes(linetype = after_stat(at)), 
                    at = c("median",
                           "hdi")) +
         stat_spike(mapping = aes(linetype = after_stat(at)), 
                    at = c(-0.25, 0.25),
                    colour = "red") +
         scale_fill_manual(values = "skyblue",
                           na.value = "gray75") +
         scale_thickness_shared() + # makes sure spike same size as distribution
         facet_grid(.~Species) + 
         labs(y = "Sepal Length comparison",
              x = "Sepal Width (in mm)") +
         theme_minimal() +
         theme(legend.position = "none")

Which should come out looking like this

I used spikes for the ROPE instead of an interval, but I would really like to use an interval and especially to shade the area of the distribution that falls between -0.25 and 0.25, and annotate with labels indicating proportion of density within the ROPE. One of the problems is that stat_interval() and its related functions don't seem to accept a fixed cartesian interval, instead calculating HDI, which moves around depending on the distribution of estimates.

I don't know if ggdist contains the necessary functionality to do this, but if it does, or if anyone can suggest another way of achieving my aims, I would appreciate it.

Solution

I'm not sure how to do this within ggdist (or if it can be done).⁺ So, here's a way to construct the plot by directly calculating the desired density values from the posterior draws and then creating a custom ggplot.

The overlapping shading of the ROPE and HDI might be confusing, but hopefully you can play around with the code to get something that meets your needs.*

# Calculate HDIs
hdi = compCat %>% 
  group_by(Species, SepalLength_cat) %>% 
  summarise(as_tibble(hdi(.epred))) %>%
  rename(hdi.lwr=V1, hdi.upr=V2)

# Calculate full-range densities
adj = 0.5
dens.hist = compCat %>% 
  group_by(Species, SepalLength_cat) %>% 
  group_split() %>% 
  map_df(~{
    dens = density_bounded(.x$.epred, adjust=adj)
    tibble(Species=.x$Species[1], SepalLength_cat=.x$SepalLength_cat[1],
           x=dens$x, .epred=dens$y)
  }) %>% 
  left_join(hdi)

# Calculate ROPE densities
rope.rng = c(-0.25, 0.25)

rope.dens = dens.hist %>% 
  filter(x >= min(rope.rng), x <= max(rope.rng)) %>% 
  group_by(Species, SepalLength_cat) %>% 
  summarise(rope.dens = sum(mean(diff(x)) * .epred))

rope.color = hcl(0, 80, 30, alpha=c(1, 0.7))
hdi.color = hcl(240, 100, 30, alpha=c(1,0.5))

theme_set(theme_minimal(base_size=14))

dens.hist %>% 
  ggplot(aes(x, .epred)) +
    geom_hline(yintercept=0, linewidth=0.1, colour="black") +
    # Overall densities
    geom_line(linewidth=0.3) +
    # Shade ROPE
    geom_area(data=. %>% 
                filter(x >= min(rope.rng), x <= max(rope.rng)),
              aes(fill="ROPE")) +
    # Shade HDI
    geom_area(data=. %>% 
                filter(x >= hdi.lwr, x <= hdi.upr),
              aes(fill="HDI")) +
    # HDI segment
    geom_segment(data=hdi, 
                 aes(x=hdi.lwr, xend=hdi.upr, y=-0.1, yend=-0.1,
                     colour="HDI")) +
    # ROPE segment
    geom_line(data=. %>% 
                group_by(Species, SepalLength_cat) %>% 
                reframe(x=rope.rng, .epred=-0.2),
              aes(colour="ROPE")) +
    # ROPE density value
    geom_text(data=rope.dens, 
              size=3.5, vjust=0.5,
              aes(label=paste0(round(rope.dens*100, 1),"%"), 
                  x = mean(rope.rng), 
                  y = -0.4,
                  colour="ROPE")) +
    facet_grid(cols=vars(Species), row=vars(fct_rev(SepalLength_cat)), switch="y") +
    theme(
      strip.text.y.left=element_text(angle=0, vjust=0.02),
      axis.text.y=element_blank(),
      axis.ticks.y=element_blank()
    ) +
    scale_y_continuous(expand=c(0.03, 0.03)) +
    scale_fill_manual(values=c(ROPE=rope.color[2], HDI=hdi.color[2])) +
    scale_colour_manual(values=c(ROPE=rope.color[1], HDI=hdi.color[1])) +
    guides(colour="none") +
    labs(y="Sepal Length comparison", x="Sepal Width", fill=NULL)

+ Edit: FYI, Matt Kay, the author of the ggdist package, added an answer in the Stan forum showing how to create this plot with ggdist functions.

* After not receiving an answer for a few days here, this question was also cross-posted to the Stan Community, so I've also posted this answer there.