r ggplot2 ggdist

Coloring dots with stat_dots by third variable changes the display of quantile dot plot


I want to display quantile dot plots for the posteriors of correlation coefficients calculated between one outcome and three other variables. The plot should show the posterior distributions for the three correlations separately within the same plot. Within each distribution, I want to color-code the dots according to "rules of thumb" interpretations of magnitude, the idea being that some correlations will show a mix of points falling into the small, medium, or strong range.

No matter how I try this, I keep arriving at a plot that splits apart my three distributions by the third variable that I am using to color code.

Code showing the problem is as follows:

library(dplyr)
library(ggplot2)
library(ggdist)

# Example data (replace this with your actual data)
set.seed(123)
data <- data.frame(
  variable = rep(c("Variable1", "Variable2", "Variable3"), each = 300),
  correlation = c(rnorm(300, 0.2, 0.1), rnorm(300, 0.4, 0.1), rnorm(300, 0.6, 0.1))
)

# Categorizing correlation strength
data <- data %>%
  mutate(correlation_strength = case_when(
    abs(correlation) < 0.3 ~ "Small",
    abs(correlation) >= 0.3 & abs(correlation) < 0.5 ~ "Medium",
    abs(correlation) >= 0.5 ~ "Strong"
  ))

ggplot(data, aes(x = variable, y = correlation, color = correlation_strength)) +
  stat_dots(quantiles = 50, position = position_dodge(width = 0.8)) + 
  scale_color_manual(values = c("Small" = "blue", "Medium" = "green", "Strong" = "red")) +
  theme_minimal() +
  labs(title = "Posterior Draws of Correlation Coefficients",
       subtitle = "Dots colored by correlation strength",
       x = "Variable",
       y = "Correlation Coefficient",
       color = "Correlation Strength")

In the ggplot aes argument, I’ve tried different combinations of passing slab_fill and group = NA to no avail.


Solution

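  • Mapping color to a column of the data makes ggplot2 treat each strength category as its own group, so stat_dots builds a separate dot plot for each category within every variable. You can use ggplot2::after_stat() instead to set the color after the statistical transformation has been applied, based on the computed dot positions (y).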

    library(dplyr)
    library(ggplot2)
    library(ggdist)
    
    # Example data (replace this with your actual data)
    set.seed(123)
    data <- data.frame(
      variable = rep(c("Variable1", "Variable2", "Variable3"), each = 300),
      correlation = c(rnorm(300, 0.2, 0.1), rnorm(300, 0.4, 0.1), rnorm(300, 0.6, 0.1))
    )
    
    ggplot(data, aes(x = variable, y = correlation)) +
      stat_dots(
        aes(color = after_stat(
          case_when(
            abs(y) < 0.3 ~ "Small",
            abs(y) >= 0.3 & abs(y) < 0.5 ~ "Medium",
            abs(y) >= 0.5 ~ "Strong"
          )
        )),
        quantiles = 50, position = position_dodge(width = 0.8)
      ) +
      scale_color_manual(
        values = c("Small" = "blue", "Medium" = "green", "Strong" = "red")
      ) +
      theme_minimal() +
      labs(
        title = "Posterior Draws of Correlation Coefficients",
        subtitle = "Dots colored by correlation strength",
        x = "Variable",
        y = "Correlation Coefficient",
        color = "Correlation Strength"
      )
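
To see why the original mapping splits the distributions, here is a minimal sketch that counts the groups ggplot2 would form when color is mapped to correlation_strength in the data. It assumes the same simulated data as above; the helper strength() and the object group_counts are hypothetical names introduced only for this illustration.

```r
library(dplyr)

# Same simulated data as in the question
set.seed(123)
data <- data.frame(
  variable = rep(c("Variable1", "Variable2", "Variable3"), each = 300),
  correlation = c(rnorm(300, 0.2, 0.1), rnorm(300, 0.4, 0.1), rnorm(300, 0.6, 0.1))
)

# Hypothetical helper: rule-of-thumb label for a correlation value
strength <- function(r) {
  case_when(
    abs(r) < 0.3 ~ "Small",
    abs(r) < 0.5 ~ "Medium",
    TRUE ~ "Strong"
  )
}

# With a discrete x and a discrete color mapped from the data, ggplot2 groups
# by their interaction, so each row below would get its own dot plot
group_counts <- data %>%
  mutate(correlation_strength = strength(correlation)) %>%
  count(variable, correlation_strength)

print(group_counts)
```

Each row of group_counts corresponds to one variable-by-strength group, and the quantiles = 50 dots are computed within each of those groups rather than once per variable. Computing the label inside after_stat() avoids this because the grouping is then determined by variable alone.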