r lapack manova

mshapiro.test 'Error in solve.default(R %*% t(R), tol = 1e-18) : Lapack routine dgesv: system is exactly singular: U[7,7] = 0'


I am trying to run a multivariate normality test with the mvnormtest package. My data have two between-subjects variables, one within-subjects variable, and three binary (0/1) dependent variables. The data look like this (~5,600 rows, no missing values):

Cluster  Group  Trial  Measure  Measure2  Measure
1        4      1      1        1         0
1        4      1      0        0         0
1        4      1      1        1         0
1        4      1      1        1         0
1        4      1      1        1         1
1        4      1      1        1         1
1        4      1      1        1         0
1        4      1      1        1         0

Here is my setup:

data.df <- read.csv("data.csv", header = TRUE, sep = ",")

attach(data.df)
names(data.df)

I attempted the following mshapiro.test() call:

#multivariate normality
dataMat <- data.matrix(data.df)
mshap <- mshapiro.test(dataMat)

I received the following error:

Error in solve.default(R %*% t(R), tol = 1e-18): 
Lapack routine dgesv: system is exactly singular: U[7,7] = 0. 

I checked a forum from a stats class I took a year ago and found that someone got it to work by dividing the data into groups:

LowCluster <- t(dataMat[c(1:1877),1:6])
MedCluster <- t(dataMat[c(1878:3166),1:6])
HigCluster <- t(dataMat[c(3167:5364),1:6])
mshaplow <- mshapiro.test(LowCluster)
mshapmed <- mshapiro.test(MedCluster)
mshaphigh <- mshapiro.test(HigCluster)

I got the same error:

Error in solve.default(R %*% t(R), tol = 1e-18) : 
Lapack routine dgesv: system is exactly singular: U[7,7] = 0

How can I resolve this?


Solution

  • A couple of issues. First, mshapiro.test expects each variable in a row and each observation in a column, so you need to transpose your data with t().

    But it will still fail with a singular matrix error, because some of your columns are exact linear combinations of others. In the sample shown, Group equals 4*Cluster, and Measure is identical to Measure2. Check out this discussion on singular matrices for more info.

    Assuming you only want to test normality on the three Measure columns, here is a short example that runs and illustrates the singular matrix issue:

    df2 <- data.df[, c(4, 5, 6)]  # keep only the three Measure columns
    # Change one value so that no column is a linear combination of another
    df2[8, 1] <- 0
    mshapiro.test(t(df2))
    

    But are all of your Measure values 0 or 1? If so, why would you test for normality?
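To see where the singularity comes from before calling mshapiro.test, you can check the rank of the matrix that solve() is asked to invert. This is only a rough sketch in base R, under a few assumptions: the eight rows are retyped by hand from the question, the names m1, m2 and m3 are hypothetical stand-ins for the three measure columns, and R is taken to be the transposed data with each variable's mean subtracted, which is a guess based on the R %*% t(R) in the error message rather than something confirmed from the package source.

```r
# Eight sample rows retyped from the question (illustration only).
# m1, m2, m3 are hypothetical names for the three measure columns.
sample_df <- data.frame(
  Cluster = rep(1, 8),
  Group   = rep(4, 8),
  Trial   = rep(1, 8),
  m1      = c(1, 0, 1, 1, 1, 1, 1, 1),
  m2      = c(1, 0, 1, 1, 1, 1, 1, 1),
  m3      = c(0, 0, 0, 0, 1, 1, 0, 0)
)

# Variables in rows, observations in columns.
U <- t(as.matrix(sample_df))

# Subtract each variable's mean (assumed to mirror the R in the error message).
R <- U - rowMeans(U)

# solve() can only invert this 6 x 6 matrix if its rank is also 6.
cross <- R %*% t(R)
rank_cross <- qr(cross)$rank

# Columns with a single distinct value become all zeros after centering.
constant_cols <- names(sample_df)[
  sapply(sample_df, function(x) length(unique(x)) == 1)
]

# Columns that are exact copies of an earlier column.
duplicate_cols <- names(sample_df)[duplicated(as.list(sample_df))]

print(dim(cross))
print(rank_cross)
print(constant_cols)
print(duplicate_cols)
```

If rank_cross is smaller than the number of variables, solve() will report a singular system. Dropping the columns listed in constant_cols and duplicate_cols is usually the first thing to try, but it only helps if the remaining columns are not linear combinations of each other in the full data.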