I just performed a factorial ANOVA, followed by the TukeyHSD
post-test. Some of my adjusted P values from the TukeyHSD
output are 0.0000000
. Can these P values really be zero? Or is this a rounding situation, and my true P value might be something like 1e-17, that is rounded to 0.0000000
.
Are there any options for the TukeyHSD()
function in R that will give output P-values that contain exponents?
Here is a snippet of my output:
TukeyHSD(fit)
Tukey multiple comparisons of means
95% family-wise confidence level
Fit: aov(formula = lum ~ cells * treatment)
$`cells:treatment`
diff lwr upr p adj
NULL:a-KR:a -266.5833333 -337.887800 -195.2788663 0.0000000
WT:a-KR:a -196.3333333 -267.637800 -125.0288663 0.0000000
KR:ar-KR:a 83.4166667 12.112200 154.7211337 0.0053485
NULL:ar-KR:a -283.5000000 -354.804467 -212.1955330 0.0000000
WT:ar-KR:a -196.7500000 -268.054467 -125.4455330 0.0000000
KR:e-KR:a -219.0833333 -290.387800 -147.7788663 0.0000000
NULL:e-KR:a -185.0833333 -256.387800 -113.7788663 0.0000000
WT:e-KR:a -96.1666667 -167.471134 -24.8621996 0.0003216
EDIT: see warning below about resolution of Tukey p-values!!
dd <- data.frame(y=c(1:10,1001:1010),f=rep(c("A","B"),each=10))
fit <- aov(y~f,data=dd)
The printed p-value is zero:
(tt <- TukeyHSD(fit))
## Tukey multiple comparisons of means
## 95% family-wise confidence level
##
## Fit: aov(formula = y ~ f, data = dd)
##
## $f
## diff lwr upr p adj
## B-A 1000 997.1553 1002.845 0
But looking at the (abbreviated) output of str()
shows there's more information there:
str(tt)
## List of 1
## $ f: num [1, 1:4] 1.00e+03 9.97e+02 1.00e+03 2.62e-14
## ..- attr(*, "dimnames")=List of 2
##
You can extract the value yourself:
tt$f[,"p adj"]
## [1] 2.620126e-14
Or as noted in the comments, print(tt,digits=15)
will work ...
WARNING
I decided to dig a little deeper and noticed in digging through the code of TukeyHSD.aov()
that it relies on ptukey()
, which in its "Examples" section warns that "the precision may not be more than about 8 digits". In particular, once the t-statistic is above about 30, the p-value maxes out (mins out?) at 2.62e-14
...
zval <- 10^seq(1,6,length=100)
pval <- ptukey(zval,2,18,lower.tail = FALSE)
par(las=1,bty="l")
plot(zval,pval,log="xy",type="l")
The bottom line is that you can't distinguish among p-values this small at all. You may need to rethink your strategy ...