r manova

summary.manova output shows different p-values in the console, the summary.manova stats table, and broom tidy()


I noticed that the summary.manova() function in R appears to produce two different p-values: one in the table printed to the console, and another in the stats table stored in the summary object. The values look slightly different. Which p-value should be reported? I first noticed this when using the tidy() function from broom, which reports the p-value from the stats table rather than the one shown in the console.

I can recreate the problem using the iris data frame:

head(iris)
fit = manova(as.matrix(iris[,1:4]) ~ Species, data = iris)
fit_summary = summary.manova(fit, test = "Wilks")
fit_summary #output1
fit_summary$stats #output2
broom::tidy(fit, test = "Wilks") #output2 

Solution

  • Nice reproducible example! From everything I can see here, the only differences are in output representation, not in the underlying values.

    In the printed summary output, p-values below a threshold (by default machine epsilon, .Machine$double.eps, which is about 2.2e-16) are printed only as "<2.2e-16" (on the theory that you probably shouldn't be worrying about differences among tiny p-values anyway ...)

    > fit_summary #output1
               Df    Wilks approx F num Df den Df    Pr(>F)    
    Species     2 0.023439   199.15      8    288 < 2.2e-16 ***
    Residuals 147                                              
    ---
    Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
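
    To see the threshold at work, here is a minimal sketch. It assumes the print method relies on the same default cutoff as format.pval() (its eps argument, which defaults to .Machine$double.eps); the variable names p_exact and eps are only illustrative.

```r
# Refit the model from the question and pull the stored p-value
fit <- manova(as.matrix(iris[, 1:4]) ~ Species, data = iris)
p_exact <- summary(fit, test = "Wilks")$stats["Species", "Pr(>F)"]

# Default cutoff below which format.pval() stops showing the exact value
eps <- .Machine$double.eps

# Is the stored value below the cutoff?
below_cutoff <- p_exact < eps
print(below_cutoff)

# With the default eps, the value is shown as "< <cutoff>"
p_default <- format.pval(p_exact)
print(p_default)

# With eps = 0 the cutoff is disabled and the stored value is formatted
p_full <- format.pval(p_exact, eps = 0)
print(p_full)
```

    The stored number never changes; only the formatting step decides whether to show it or the "<" placeholder.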
    

    If you explicitly extract the $stats component, then you get a value printed to R's default 7-digit precision:

    > fit_summary$stats #output2
               Df      Wilks approx F num Df den Df        Pr(>F)
    Species     2 0.02343863 199.1453      8    288 1.365006e-112
    Residuals 147         NA       NA     NA     NA            NA
    

    If you use tidy, it returns a tibble rather than a plain data frame. Tibbles have their own defaults for output precision (by default, they print only 3 significant digits).

    > broom::tidy(fit, test = "Wilks") 
    # A tibble: 2 x 7
      term         df   wilks statistic num.df den.df    p.value
      <chr>     <dbl>   <dbl>     <dbl>  <dbl>  <dbl>      <dbl>
    1 Species       2  0.0234      199.      8    288  1.37e-112
    2 Residuals   147 NA            NA      NA     NA NA  
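
    A quick way to check that the two sources hold the same number is to compare them directly. This is a sketch, not a definitive test: the broom part only runs if broom is installed, and the names p_stats and p_tidy are made up for illustration. The comparison is done on the log10 scale because all.equal() falls back to an absolute tolerance for values this close to zero, which would make almost any two tiny numbers look equal.

```r
fit <- manova(as.matrix(iris[, 1:4]) ~ Species, data = iris)

# p-value as stored in the summary object's stats matrix
p_stats <- summary(fit, test = "Wilks")$stats["Species", "Pr(>F)"]

# Optional: compare against broom's tidy() output, if broom is available
if (requireNamespace("broom", quietly = TRUE)) {
  tidy_fit <- broom::tidy(fit, test = "Wilks")
  p_tidy <- tidy_fit$p.value[tidy_fit$term == "Species"]
  # Compare exponents/mantissas via log10 rather than the raw tiny values
  print(all.equal(log10(p_stats), log10(p_tidy)))
}

# The same stored number, formatted at two different precisions
p_3digits <- format(p_stats, digits = 3)
p_7digits <- format(p_stats, digits = 7)
print(c(p_3digits, p_7digits))
```

    The two formatted strings differ only in how many significant digits are shown.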
    
      
    

    All of these defaults can be changed. For example, ?tibble::formatting tells you that options(pillar.sigfig=7) sets the number of significant digits used for tibble printing to 7, and ?options tells you that options(digits=n) changes the default for base-R printing.
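
    As a small sketch of how that might look in practice (the object names stats_tab and old are illustrative; the pillar.sigfig option only has a visible effect when a tibble is printed):

```r
fit <- manova(as.matrix(iris[, 1:4]) ~ Species, data = iris)
stats_tab <- summary(fit, test = "Wilks")$stats

# Per-call override: print this one matrix with more digits
print(stats_tab, digits = 10)

# Session-wide override: options() returns the previous settings,
# so they can be restored afterwards
old <- options(digits = 3, pillar.sigfig = 7)
print(stats_tab)   # base-R printing now uses 3 significant digits
options(old)       # restore the earlier settings
```

    Saving the return value of options() and restoring it afterwards keeps the change from leaking into the rest of the session.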