I am trying to conduct a multivariate normality test using the mvnormtest package on my data with two between-subjects variables, one within-subjects, and three dependent variables (binary categorical). My data looks like this (~5,600 rows with no missing data):
Cluster Group Trial Measure Measure2 Measure
1 4 1 1 1 0
1 4 1 0 0 0
1 4 1 1 1 0
1 4 1 1 1 0
1 4 1 1 1 1
1 4 1 1 1 1
1 4 1 1 1 0
1 4 1 1 1 0
Here is my setup:
data.df <- read.csv(
"data.csv",
header=TRUE, sep=","
)
attach(data.df)
names(data.df)
I attempted the following mshapiro.test() call:
#multivariate normality
dataMat <- data.matrix(data.df)
mshap <- mshapiro.test(dataMat)
I received the following error:
Error in solve.default(R %*% t(R), tol = 1e-18):
Lapack routine dgesv: system is exactly singular: U[7,7] = 0.
I checked a forum from my stats class a year ago and found that someone was able to get it to work by dividing the data into groups:
LowCluster <- t(dataMat[c(1:1877),1:6])
MedCluster <- t(dataMat[c(1878:3166),1:6])
HigCluster <- t(dataMat[c(3167:5364),1:6])
mshaplow <- mshapiro.test(LowCluster)
mshapmed <- mshapiro.test(MedCluster)
mshaphigh <- mshapiro.test(HigCluster)
I got the same error.
Error in solve.default(R %*% t(R), tol = 1e-18) :
Lapack routine dgesv: system is exactly singular: U[7,7] = 0
How can I resolve this?
A couple of issues. First, the mshapiro.test function expects variables in rows and observations in columns, so you need to use t() to transpose your data.
But it will still fail with a singular-matrix error, because some of your columns are exact linear combinations of each other. For example, in the rows shown, Group is equal to 4*Cluster, and Measure is the same as Measure2. Check out this discussion on singular matrices for more info.
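If you want to check for this kind of redundancy yourself, here is a minimal base-R sketch. It assumes a toy data frame copied from the eight rows shown in the question; the names toy, Measure1, Measure2, and Measure3 are hypothetical (the duplicated Measure header is made unique for illustration). It flags columns that never vary and uses the rank from qr() to see how many linearly independent columns remain after centering.

```r
# Toy data copied from the eight rows shown in the question.
# Column names Measure1..Measure3 are hypothetical (made unique for illustration).
toy <- data.frame(
  Cluster  = rep(1, 8),
  Group    = rep(4, 8),
  Trial    = rep(1, 8),
  Measure1 = c(1, 0, 1, 1, 1, 1, 1, 1),
  Measure2 = c(1, 0, 1, 1, 1, 1, 1, 1),
  Measure3 = c(0, 0, 0, 0, 1, 1, 0, 0)
)

# Columns that never vary contribute nothing once the mean is subtracted.
constant_cols <- names(toy)[sapply(toy, function(v) length(unique(v)) == 1)]

# Center each column, then ask how many linearly independent columns remain.
centered <- scale(as.matrix(toy), center = TRUE, scale = FALSE)
n_independent <- qr(centered)$rank

print(constant_cols)
print(n_independent < ncol(toy))  # TRUE means at least one column is redundant
```

On your full data the constant columns may differ from this eight-row sample, so it is worth running the same check on data.df itself.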
Assuming you only want to test normality on the Measure variables, here is a sample of code that will work to illustrate the singular matrix issue:
df2 <- data.df[, c(4, 5, 6)]  # keep only the three Measure columns
df2[8, 1] <- 0  # changing this value makes it so no column is a linear combo of any other column
mshapiro.test(t(df2))  # t() puts the variables in rows
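To see why that single changed value matters without needing the package, here is a rough base-R sketch. The error message in the question shows that the failing call is solve(R %*% t(R), tol = 1e-18). The sketch assumes R is the variables-in-rows matrix with each row's mean subtracted; the helper try_invert and the column names Measure1 to Measure3 are hypothetical, and the data are the eight rows from the question.

```r
# Eight rows of the three Measure columns from the question.
# Column names Measure1..Measure3 are hypothetical (made unique for illustration).
df2 <- data.frame(
  Measure1 = c(1, 0, 1, 1, 1, 1, 1, 1),
  Measure2 = c(1, 0, 1, 1, 1, 1, 1, 1),
  Measure3 = c(0, 0, 0, 0, 1, 1, 0, 0)
)

# Hypothetical helper: attempt the inversion named in the error message.
# Assumption: R is the transposed data with each row's mean subtracted.
try_invert <- function(d) {
  U <- t(as.matrix(d))   # variables in rows, observations in columns
  R <- U - rowMeans(U)   # subtract each variable's mean from its row
  tryCatch(solve(R %*% t(R), tol = 1e-18), error = function(e) e)
}

before <- try_invert(df2)  # Measure1 and Measure2 are identical here
df2[8, 1] <- 0             # break the exact duplication
after <- try_invert(df2)

print(inherits(before, "error"))  # did the inversion fail before the change?
print(is.matrix(after))           # did it succeed after the change?
```

This only reproduces the inversion step, not the test statistic, so treat it as a way to diagnose the singularity rather than a replacement for mshapiro.test.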
But are all of your Measure values 0 or 1? If so, why would you test for normality?