I have a dataframe with different groups ('label' column). For each label, I want to plot a null distribution obtained from bootstrapping (values are in the 'null' column) and the true statistic on top (value in the 'sc' column). Ideally, I would like the area after the statistic to have a different color, to mark that this is my p-value. Is this possible to do with stat_density_ridges?
Here is some example R code:
library(ggplot2)
library(tidyverse)
library(ggridges)
df <- data.frame()
for (label in LETTERS) {
  mean = rnorm(1, 0.5, 0.2)
  null = rnorm(1000, mean, 0.1)
  sc = rnorm(1, 0.5, 0.2)
  df <- rbind(df, data.frame(label = label, null = null, sc = sc))
}
df <- df %>%
  mutate(label = as.factor(label))
ggplot(df, aes(x = null, y = label)) +
  stat_density_ridges(scale = 1.2, alpha = 1, size = 1) +
  scale_x_continuous(limits = c(0, 1), breaks = seq(0, 1, 0.2)) +
  geom_segment(aes(x = sc, xend = sc, y = as.numeric(label) - 0.1, yend = as.numeric(label) + 0.5), size = 1) +
  coord_flip()
The resulting figure is this:
But ideally, I would like each ridge to be more like this:
With the color changes after the sc value. Is that possible? Thanks :)
You can map fill to a condition on ..x.. to get different colors on either side of a threshold on the x axis. On its own, this gives the same shaded area in every ridge. To use a separate threshold per group, you can modify the built plot with ggplot_build: create a separate dataframe that holds the threshold for each label (the sc value, stored in a column named p_value in the code below), join it to the layer data, and set the fill color conditionally. Here is some reproducible code:
library(ggplot2)
library(tidyverse)
library(ggridges)
df <- data.frame()
set.seed(7) # for reproducibility
for (label in LETTERS) {
  mean = rnorm(1, 0.5, 0.2)
  null = rnorm(1000, mean, 0.1)
  sc = rnorm(1, 0.5, 0.2)
  df <- rbind(df, data.frame(label = label, null = null, sc = sc))
}
df <- df %>%
  mutate(label = as.factor(label))
# Create dataframe with one threshold per label (the sc value, stored as p_value)
p_values = df %>%
  group_by(label) %>%
  summarise(p_value = unique(sc)) %>%
  mutate(label = as.integer(label)) # make sure label matches the integer group in ggplot_build
# plot
p <- ggplot(df, aes(x = null, y = label, fill = ifelse(..x.. < sc, "no sign", "sign"), group = factor(label))) +
  stat_density_ridges(geom = "density_ridges_gradient",
                      scale = 1.2, alpha = 1, size = 1,
                      calc_ecdf = TRUE) +
  scale_fill_manual(values = c("red", "blue"), name = "") +
  coord_flip()
p
#> Picking joint bandwidth of 0.0224
# Modify layer
q <- ggplot_build(p)
#> Picking joint bandwidth of 0.0224
q$data[[1]] = q$data[[1]] %>%
  left_join(., p_values,
            by = c("group" = "label")) %>%
  mutate(fill = case_when(x < p_value ~ fill,
                          TRUE ~ "blue")) %>%
  select(-p_value)
q <- ggplot_gtable(q)
plot(q)
Created on 2023-03-28 with reprex v2.0.2
As you can see in the last plot, the color change in each ridge now falls at the sc value of that group in your dataframe.
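Note that the p_value column above holds the sc threshold, not an actual p-value. If you also want the number that the shaded tail represents, here is a minimal sketch, assuming a one-sided test in which larger values are more extreme: take the fraction of null values at or above sc in each group. The toy data frame and the name emp_p are made up for illustration and are not part of the code above.

```r
library(dplyr)

set.seed(7) # for reproducibility

# Toy data in the same shape as df: one sc value repeated within each label
toy <- data.frame(
  label = rep(c("A", "B"), each = 1000),
  null  = c(rnorm(1000, 0.5, 0.1), rnorm(1000, 0.4, 0.1)),
  sc    = rep(c(0.6, 0.7), each = 1000)
)

# Empirical one-sided p-value (assumption: larger values are more extreme):
# the share of the null distribution at or above sc within each label
emp_p <- toy %>%
  group_by(label) %>%
  summarise(p_value = mean(null >= sc), sc = first(sc))

emp_p
```

The same summarise call should work on df as well, giving one value per label that you could, for example, print next to each ridge.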