I would like to add a p-value to a scatter plot while respecting APA style. This entails two elements: (a) an italicized p, and (b) stripping the leading zero (and also formatting values smaller than .001 as < .001).
# Formatting function
format.p <- function(p, precision = 0.001) {
  digits <- -log(precision, base = 10)
  p <- formatC(p, format = 'f', digits = digits)
  # Values that round to zero at this precision are reported as "< precision"
  p[p == formatC(0, format = 'f', digits = digits)] <- paste0('< ', precision)
  # Strip the leading zero
  sub("0", "", p)
}
# Get p-value
(p = cor.test(mtcars$wt, mtcars$mpg)$p.value)
# 1.293959e-10
# Format p-value
(p = format.p(p))
# "< .001"
# Make plot
library(ggplot2)
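# Attempt 1: plain-text label (p is not italicized)
ggplot(mtcars, aes(x = wt, y = mpg)) +
  stat_smooth(geom = "line", method = "lm") +
  annotate(geom = "text", label = paste0("p = ", p), x = 4.5, y = 25, size = 8)
# Attempt 2: plotmath label (p is italicized, but the leading zero comes back)
ggplot(mtcars, aes(x = wt, y = mpg)) +
  stat_smooth(geom = "line", method = "lm") +
  annotate(geom = "text", label = paste0("italic('p')~'='", p), parse = TRUE, x = 4.5, y = 25, size = 8)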
Notice, however, that once the label is parsed, the stripped zero is lost: the leading zero is back, which is not what I want. Any idea how to fix this?
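The key was to change label=paste0("italic('p')~'='", p) to label=sprintf("italic('p')~'%s'", p). This places the whole formatted value inside quotes, so it is drawn as a string instead of being re-parsed as a number.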
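To see why the quoting matters, here is a small base R sketch (no ggplot2 needed). It assumes the label text goes through R's parser when parse=TRUE, and the value .045 is only an illustration, not a result from the data above.

```r
# A bare number in a parsed label becomes a numeric literal,
# so its original spelling (without the leading zero) is lost.
bare <- parse(text = ".045")[[1]]
is.numeric(bare)      # TRUE
as.character(bare)    # "0.045"

# A quoted number stays a character string and is kept verbatim.
quoted <- parse(text = "'.045'")[[1]]
is.character(quoted)  # TRUE
quoted                # ".045"
```

The same thing happens inside a longer plotmath string: anything outside quotes is treated as R syntax, anything inside quotes is drawn as written.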
Furthermore, to avoid situations where the label would show both an equals sign and a less-than sign (e.g., p = < .001), I have also modified the format.p() function to choose either < or = depending on the situation.
Here's the final solution:
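# Formatting function
format.p <- function(p, precision = 0.001) {
  digits <- -log(precision, base = 10)
  # Compare the numeric value before it is converted to a string
  if (p < precision) {
    out <- paste0('< ', formatC(precision, format = 'f', digits = digits))
  } else {
    out <- paste0('= ', formatC(p, format = 'f', digits = digits))
  }
  # Strip the leading zero only (leaves a value such as 1.000 untouched)
  sub("0.", ".", out, fixed = TRUE)
}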
# Get p-value
(p = cor.test(mtcars$wt, mtcars$mpg)$p.value)
# 1.293959e-10
# Format p-value
(p = format.p(p))
# "< .001"
# Make plot
library(ggplot2)
ggplot(mtcars, aes(x = wt, y = mpg)) +
  stat_smooth(geom = "line", method = "lm") +
  annotate(geom = "text", label = sprintf("italic('p')~'%s'", p), parse = TRUE, x = 4.5, y = 25, size = 8)
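The function above handles one p-value at a time, because if expects a single condition. If you need to label several tests at once, a vectorized variant might look like the sketch below. The name format_p_apa is hypothetical (it is not part of any package), and the input values in the usage line are made up for illustration.

```r
# Vectorized sketch: returns "< .001" or "= .045" style strings.
format_p_apa <- function(p, precision = 0.001) {
  digits <- round(-log10(precision))
  small <- p < precision
  # Show the threshold itself for small values, the rounded p otherwise
  shown <- formatC(ifelse(small, precision, p), format = "f", digits = digits)
  # Strip the leading zero only when the string starts with "0."
  shown <- sub("^0\\.", ".", shown)
  paste(ifelse(small, "<", "="), shown)
}

labels <- format_p_apa(c(1.293959e-10, 0.0451, 0.5))
labels
# "< .001" "= .045" "= .500"
```

Each element can then be passed to sprintf("italic('p')~'%s'", ...) exactly as in the plot above.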