I am trying to carry out a MANOVA. There are 7
dependent variables and a categorical independent variable representing 6
groups.
The data are available here: http://pastebin.com/fqXNjWtr
Click "download" above the text. I am reading the file into R
like this (the downloaded file should have the same name for you; I'm on a Mac):
> df <- read.csv("~/downloads/fqXNjWtr.txt", stringsAsFactors = FALSE)
> str(df)
'data.frame': 244 obs. of 8 variables:
$ var1 : num 0.3 0 0.312 0 0.643 ...
$ var2 : num 0 0.125 0 0.375 0.0714 ...
$ var3 : num 0 0.0625 0.0625 0 0.0714 ...
$ var4 : num 0.2 0.3125 0.0625 0.0625 0 ...
$ var5 : num 0.1 0.25 0.438 0.188 0 ...
$ var6 : num 0.2 0.0625 0.125 0.0625 0.0714 ...
$ var7 : num 0.2 0.188 0 0.312 0.143 ...
$ cluster_assignment: int 1 4 2 6 1 4 3 3 4 6 ...
I then create the dependent-variable matrix, DV:
> df$DV <- as.matrix(df[, 1:7])
I am then carrying out the MANOVA:
> mv_out <- manova(DV ~ cluster_assignment, data = df)
> mv_out
Call:
manova(DV ~ cluster_assignment, data = df)
Terms:
cluster_assignment Residuals
resp 1 5.160838 6.738524
resp 2 3.384101 3.622020
resp 3 0.000200 3.365565
resp 4 0.065469 2.743549
resp 5 0.889180 8.019733
resp 6 0.442187 5.884827
resp 7 3.133188 7.736993
Deg. of Freedom 1 242
Residual standard errors: 0.1668686 0.1223398 0.1179292 0.1064752 0.1820423 0.1559406 0.1788045
Estimated effects may be unbalanced
When I then call summary(), I get this error:
> summary(mv_out)
Error in summary.manova(mv_out) : residuals have rank 6 < 7
Based on some other posts, this error seems to mean either that there are too few observations for the number of variables, or that some of the dependent variables are collinear. But neither seems to be the case with these data:
> cor(df[, 1:7])
var1 var2 var3 var4 var5 var6 var7
var1 1.00000000 -0.417605243 -0.05274197 -0.118358341 -0.25617705 0.06089533 -0.4360312
var2 -0.41760524 1.000000000 -0.07181878 0.008873035 -0.29523300 -0.33954011 0.1958746
var3 -0.05274197 -0.071818782 1.00000000 0.131137673 -0.11624079 -0.14408909 -0.2951076
var4 -0.11835834 0.008873035 0.13113767 1.000000000 -0.14361455 -0.24308229 -0.1491373
var5 -0.25617705 -0.295233000 -0.11624079 -0.143614554 1.00000000 -0.03180183 -0.2383027
var6 0.06089533 -0.339540114 -0.14408909 -0.243082287 -0.03180183 1.00000000 -0.3215075
var7 -0.43603124 0.195874568 -0.29510761 -0.149137349 -0.23830275 -0.32150753 1.0000000
I'm puzzled about what may be going on.
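You can get past this error by setting the tol argument of summary.manova (see ?summary.manova). df$DV fails the rank-deficiency check at the default tol = 1e-7 because each row of the seven variables sums to 1, so the columns are exactly linearly dependent even though no pairwise correlation is large. This might not produce the results you intended, though: the output below has a negative approximate F, which is not a valid test statistic.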
> summary(mv_out, tol = 0)
Df Pillai approx F num Df den Df Pr(>F)
df$cluster_assignment 1 1.2106 -193.79 7 236
Residuals 242
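To make the rank problem concrete, here is a minimal sketch on simulated data rather than the posted file. The data frame sim, the matrices Y and Y6, and the simulated values are all assumptions made for illustration; only the shape (244 rows, seven proportions summing to 1, six groups) mirrors the question. It checks the row sums, confirms the residual rank is 6, and then shows one possible workaround: drop one response column and treat the grouping variable as a factor. The Deg. of Freedom of 1 in the question suggests cluster_assignment was fitted as a numeric covariate, which is probably not what was intended for six groups.

```r
# Simulated stand-in for the posted data (illustrative values only):
# seven proportions per row that sum to 1, plus a 6-level grouping variable.
set.seed(1)
n <- 244
raw <- matrix(rexp(n * 7), nrow = n)
props <- raw / rowSums(raw)                 # each row now sums to 1
colnames(props) <- paste0("var", 1:7)
sim <- data.frame(props,
                  cluster_assignment = sample(1:6, n, replace = TRUE))

# 1. Every row sums to 1, so the seven columns are linearly dependent.
all_rows_sum_to_one <- isTRUE(
  all.equal(unname(rowSums(sim[, 1:7])), rep(1, n))
)

# 2. The residual matrix therefore has rank 6 rather than 7,
#    which is what summary.manova() complains about.
Y <- as.matrix(sim[, 1:7])
fit_full <- manova(Y ~ factor(cluster_assignment), data = sim)
resid_rank <- qr(residuals(fit_full))$rank

# 3. Possible workaround: drop one response so the rest are not
#    exactly dependent, and use factor() so 6 groups give 5 df.
Y6 <- Y[, 1:6]
fit_reduced <- manova(Y6 ~ factor(cluster_assignment), data = sim)
res <- summary(fit_reduced)                 # Pillai's trace by default
print(res)
```

Because the omitted column is fully determined by the other six, which column is dropped should not change the multivariate test. Whether a MANOVA on proportions is appropriate at all is a separate question that this sketch does not address.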