I'm looking to create a plot in R that dynamically retrieves and displays the results of a statistical test as an annotated text directly on the plot. I want to achieve precise formatting for the statistical notation within this annotation.
set.seed(1234)
# Create a dataset:
df <- data.frame(
id = (1:100),
resp = sample(c("Yes", "No"), 100, replace=T))
Run and store a chi-square test:
mytest <- chisq.test(table(df$resp))
Store the odds ratio:
myodds <- table(df$resp)[1]/table(df$resp)[2]
Create a plot:
library(ggplot2)
myplot <- ggplot(df, aes(x=factor(resp), fill=resp))+
geom_bar(stat="count", width = 0.5) +
ylab("N of occurrences")+
scale_x_discrete(name="Selected options") +
theme_classic()+
theme(legend.position = "none")+
scale_fill_grey(start=0.8,end =0) +
coord_cartesian(ylim=c(0, 100))+
scale_y_continuous(expand = c(0, 0))
Now, let's add the requested information for chi-square test and odds ratio as an annotated text:
myplot +
ggtitle("A plot")+
annotate("text", x =0.5, y = Inf,
hjust= 0, vjust=2,
label =
paste0("x2 = ",round(mytest$statistic, 2),
" , ",
if(mytest$p.value <.001){
paste0("p < .001")
}else{
paste0("p = ",round(mytest$p.value, 2))
},
"\nOdds = ",round(myodds, 2)))
So, here is my question: how can I turn the "2" into a superscript (i.e., present it as x^2) and the "p" into italics, without sacrificing the dynamic nature of the retrieved information? I checked out some similar solutions, but they do not seem to work when more than two pieces of information are retrieved dynamically (e.g., see "ggplot2 annotation with superscripts", "Paste string with superscript in ggplot", and "Using italics and non-italics in the same category label").
Some seemingly similar questions ("How to add a complex label with italics and a variable to ggplot?" and "ggplot format italic annotation") do not address the above issue.
However, the responses provided below from r2evans and Michiel Duvekot perfectly address the issue I encountered.
You could use bquote() if you want to use a variable or an expression in the label:
library(ggplot2)
set.seed(1234)
df <- data.frame(
id = (1:100),
resp = sample(c("Yes", "No"), 100, replace = T)
)
mytest <- chisq.test(table(df$resp))
myodds <- table(df$resp)[1] / table(df$resp)[2]
myplot <- ggplot(df, aes(x = factor(resp), fill = resp)) +
geom_bar(stat = "count", width = 0.5) +
ylab("N of occurrences") +
scale_x_discrete(name = "Selected options") +
theme_classic() +
theme(legend.position = "none") +
scale_fill_grey(start = 0.8, end = 0) +
coord_cartesian(ylim = c(0, 100)) +
scale_y_continuous(expand = c(0, 0))
myplot +
annotate(
"text",
x = 0.5,
y = Inf,
hjust = 0,
vjust = 2,
label = ifelse(
mytest$p.value < 0.001,
deparse(bquote(
paste(
x^2 == ..(round(mytest$statistic, 2)),
", ",
italic(p) < .001,
", ",
Odds == ..(round(myodds, 2))
),
splice = TRUE
)),
deparse(bquote(
paste(
x^2 == ..(round(mytest$statistic, 2)),
", ",
italic(p) == ..(round(mytest$p.value, 2)),
", ",
Odds == ..(round(myodds, 2))
),
splice = TRUE
))
),
parse = TRUE
)
Created on 2025-07-16 with reprex v2.1.1
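As a possible alternative to bquote(), the label can also be assembled as a plain plotmath string and handed to annotate() with parse = TRUE. The sketch below is only an illustration under a few assumptions: make_stat_label() is a hypothetical helper name (not part of ggplot2 or of the answer above), chi^2 is swapped in for x^2 so that a Greek chi is drawn, and the numbers are wrapped in quotes so that trailing zeros such as "0.80" should survive parsing.

```r
# Hypothetical helper (an assumption, not from the original answer):
# builds a plotmath string for annotate(..., parse = TRUE).
make_stat_label <- function(statistic, p_value, odds) {
  # Choose the p-value part dynamically, as in the question
  p_part <- if (p_value < 0.001) {
    "italic(p) < '.001'"
  } else {
    sprintf("italic(p) == '%.2f'", p_value)
  }
  # as.numeric() drops names/classes that chisq.test() and table() attach;
  # list(...) in plotmath draws its arguments separated by commas
  sprintf(
    "list(chi^2 == '%.2f', %s, Odds == '%.2f')",
    as.numeric(statistic), p_part, as.numeric(odds)
  )
}

# Example with made-up numbers
example_label <- make_stat_label(0.04, 0.8415, 1.04)

# Intended usage with the objects defined above (requires ggplot2):
# myplot +
#   annotate("text", x = 0.5, y = Inf, hjust = 0, vjust = 2,
#            label = make_stat_label(mytest$statistic, mytest$p.value, myodds),
#            parse = TRUE)
```

Because the branching on the p-value happens before the string is built, there is no need to duplicate the whole expression for the two cases.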