ranovamixedorthogonal

Error when running mixed anova in R with two within factor and one between factor


I would like some help to use R to analyze an experiment in which three groups of participants were shown two types of stimulus three times each. The dependent variable is a continuous measure. Here is an example of how the dataset looks like.

 SubjectID  Group        Trial StimType    Measure
1          1 group1 trial3_stimA        A 0.55908866
2          2 group1 trial3_stimA        A 0.98884446
3          3 group2 trial3_stimA        A 0.00000000
4          4 group2 trial3_stimA        A 0.27067991
5          5 group3 trial3_stimA        A 0.37169285
6          6 group3 trial3_stimA        A 0.42113984
7          1 group1 trial3_stimB        B 0.00000000
8          2 group1 trial3_stimB        B 0.49892807
9          3 group2 trial3_stimB        B 0.14602589
10         4 group2 trial3_stimB        B 0.50946555
11         5 group3 trial3_stimB        B 0.25572820
12         6 group3 trial3_stimB        B 0.22932966
13         1 group1 trial1_stimA        A 0.42207604
14         2 group1 trial1_stimA        A 0.85599588
15         3 group2 trial1_stimA        A 0.36428381
16         4 group2 trial1_stimA        A 0.46679336
17         5 group3 trial1_stimA        A 0.69379734
18         6 group3 trial1_stimA        A 0.55607716
19         1 group1 trial1_stimB        B 0.24261465
20         2 group1 trial1_stimB        B 0.35176384
21         3 group2 trial1_stimB        B 0.21116215
22         4 group2 trial1_stimB        B 0.33112544
23         5 group3 trial1_stimB        B 0.00000000
24         6 group3 trial1_stimB        B 0.00000000
25         1 group1 trial2_stimA        A 0.05506943
26         2 group1 trial2_stimA        A 0.22537470
27         3 group2 trial2_stimA        A 0.00000000
28         4 group2 trial2_stimA        A 0.18511144
29         5 group3 trial2_stimA        A 0.15586156
30         6 group3 trial2_stimA        A 0.04467100
31         1 group1 trial2_stimB        B 0.03890585
32         2 group1 trial2_stimB        B 0.29787709
33         3 group2 trial2_stimB        B 0.00000000
34         4 group2 trial2_stimB        B 0.28971992
35         5 group3 trial2_stimB        B 0.12993238
36         6 group3 trial2_stimB        B 0.05066011

Here is the structure of my data

'data.frame':   36 obs. of  5 variables:
 $ SubjectID: Factor w/ 6 levels "1","2","3","4",..: 1 2 3 4 5 6 1 2 3 4 ...
 $ Group    : Factor w/ 3 levels "group1","group2",..: 1 1 2 2 3 3 1 1 2 2 ...
 $ Trial    : Factor w/ 6 levels "trial1_stimA",..: 5 5 5 5 5 5 6 6 6 6 ...
 $ StimType : Factor w/ 2 levels "A","B": 1 1 1 1 1 1 2 2 2 2 ...
 $ Measure  : num  0.559 0.989 0 0.271 0.372 ...

I need to run a mixed ANOVA with groups as between subjects factor and trials and stimulus type as within subjects factor. I have tried using three different R packages and different syntaxes but R either returns an error message or the output is missing the interactions group x trial x stim type.

For example, when I use anova_test() from the rstatix package

#Trying mixed ANOVA with rstatix
> 
> mixed.anova <- anova_test(
+   data = prepared_data, dv = Measure, wid = SubjectID,
+   between = Group, within = c(Trial,StimType)
+ )
Error in check.imatrix(X.design) : 
  Terms in the intra-subject model matrix are not orthogonal.
> get_anova_table(all_subjects)
Error in is.data.frame(x) : object 'all_subjects' not found

When I use the the aov from afex package

> #using afex
> 
> 
> mixed.anova2 <- aov_car(Measure ~ Group*Trial*StimType + Error(1|SubjectID/(Trial*StimType)), prepared_data)
Error: Empty cells in within-subjects design  (i.e., bad data structure).
table(data[c("Trial", "StimType")])
#               StimType
# Trial          A B
#   trial1_stimA 6 0
#   trial1_stimB 0 6
#   trial2_stimA 6 0
#   trial2_stimB 0 6
#   trial3_stimA 6 0
#   trial3_stimB 0 6
> 
> aov.bww

Call:
aov(formula = SCR ~ Group * Trial * CSType + Error(SubjectID) + 
    Group, data = sixPhasesAbs2)

Grand Mean: 397.1325

Stratum 1: SubjectID

Terms:
                     Group  Residuals
Sum of Squares   187283464 6399838881
Deg. of Freedom          2         81

Residual standard error: 8888.777
18 out of 20 effects not estimable
Estimated effects may be unbalanced

Stratum 2: Within

Terms:
                      Trial Group:Trial   Residuals
Sum of Squares    158766098   374583941 12799639740
Deg. of Freedom           5          10         405

Residual standard error: 5621.748
12 out of 27 effects not estimable
Estimated effects may be unbalanced

Finally, I tried using lmer

> #using lmer
> model = lmer(Measure ~Group*Trial*StimType +(1|SubjectID), data = prepared_data)
fixed-effect model matrix is rank deficient so dropping 18 columns / coefficients
> 
> summary(model)
Linear mixed model fit by REML. t-tests use Satterthwaite's method ['lmerModLmerTest']
Formula: Measure ~ Group * Trial * StimType + (1 | SubjectID)
   Data: prepared_data

REML criterion at convergence: -16.8

Scaled residuals: 
    Min      1Q  Median      3Q     Max 
-1.1854 -0.5424  0.0000  0.5424  1.1854 

Random effects:
 Groups    Name        Variance Std.Dev.
 SubjectID (Intercept) 0.024226 0.15565 
 Residual              0.006861 0.08283 
Number of obs: 36, groups:  SubjectID, 6

Fixed effects:
                               Estimate Std. Error        df t value Pr(>|t|)    
(Intercept)                    0.639036   0.124672  4.459241   5.126 0.005095 ** 
Groupgroup2                   -0.223497   0.176313  4.459241  -1.268 0.267088    
Groupgroup3                   -0.014099   0.176313  4.459241  -0.080 0.939728    
Trialtrial1_stimB             -0.341847   0.082829 15.000000  -4.127 0.000896 ***
Trialtrial2_stimA             -0.498814   0.082829 15.000000  -6.022 2.34e-05 ***
Trialtrial2_stimB             -0.470644   0.082829 15.000000  -5.682 4.35e-05 ***
Trialtrial3_stimA              0.134931   0.082829 15.000000   1.629 0.124124    
Trialtrial3_stimB             -0.389572   0.082829 15.000000  -4.703 0.000283 ***
Groupgroup2:Trialtrial1_stimB  0.197452   0.117138 15.000000   1.686 0.112550    
Groupgroup3:Trialtrial1_stimB -0.283091   0.117138 15.000000  -2.417 0.028865 *  
Groupgroup2:Trialtrial2_stimA  0.175831   0.117138 15.000000   1.501 0.154095    
Groupgroup3:Trialtrial2_stimA -0.025857   0.117138 15.000000  -0.221 0.828271    
Groupgroup2:Trialtrial2_stimB  0.199966   0.117138 15.000000   1.707 0.108413    
Groupgroup3:Trialtrial2_stimB -0.063997   0.117138 15.000000  -0.546 0.592870    
Groupgroup2:Trialtrial3_stimA -0.415129   0.117138 15.000000  -3.544 0.002946 ** 
Groupgroup3:Trialtrial3_stimA -0.363452   0.117138 15.000000  -3.103 0.007276 ** 
Groupgroup2:Trialtrial3_stimB  0.301779   0.117138 15.000000   2.576 0.021070 *  
Groupgroup3:Trialtrial3_stimB  0.007164   0.117138 15.000000   0.061 0.952043    
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Correlation matrix not shown by default, as p = 18 > 12.
Use print(x, correlation=TRUE)  or
    vcov(x)        if you need it

fit warnings:
fixed-effect model matrix is rank deficient so dropping 18 columns / coefficients
> 
> anova(model)
Missing cells for: Trialtrial1_stimB:StimTypeA, Trialtrial2_stimB:StimTypeA, Trialtrial3_stimB:StimTypeA, Trialtrial1_stimA:StimTypeB, Trialtrial2_stimA:StimTypeB, Trialtrial3_stimA:StimTypeB, Groupgroup1:Trialtrial1_stimB:StimTypeA, Groupgroup2:Trialtrial1_stimB:StimTypeA, Groupgroup3:Trialtrial1_stimB:StimTypeA, Groupgroup1:Trialtrial2_stimB:StimTypeA, Groupgroup2:Trialtrial2_stimB:StimTypeA, Groupgroup3:Trialtrial2_stimB:StimTypeA, Groupgroup1:Trialtrial3_stimB:StimTypeA, Groupgroup2:Trialtrial3_stimB:StimTypeA, Groupgroup3:Trialtrial3_stimB:StimTypeA, Groupgroup1:Trialtrial1_stimA:StimTypeB, Groupgroup2:Trialtrial1_stimA:StimTypeB, Groupgroup3:Trialtrial1_stimA:StimTypeB, Groupgroup1:Trialtrial2_stimA:StimTypeB, Groupgroup2:Trialtrial2_stimA:StimTypeB, Groupgroup3:Trialtrial2_stimA:StimTypeB, Groupgroup1:Trialtrial3_stimA:StimTypeB, Groupgroup2:Trialtrial3_stimA:StimTypeB, Groupgroup3:Trialtrial3_stimA:StimTypeB.  
Interpret type III hypotheses with care.
Type III Analysis of Variance Table with Satterthwaite's method
                      Sum Sq  Mean Sq NumDF DenDF F value    Pr(>F)    
Group                0.00723 0.003614     2     3  0.5267 0.6367151    
Trial                0.96171 0.192343     5    15 28.0356 4.172e-07 ***
Group:Trial          0.44102 0.044102    10    15  6.4283 0.0007411 ***
StimType                                                               
Group:StimType                                                         
Trial:StimType                                                         
Group:Trial:StimType                                                   
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

I have searched for similar questions but I have not found any answers that would help me with this problem. I suspect (given the error messages) that I might have to change the structure of the dataset, but I do not know how to do this as I am a beginner in R. How can I run a mixed ANOVA in this dataset?


Solution

  • Since Trial contains information about both Trial and StimType, the fixed effect model matrix is rank deficient. One can see this from the afex::aov_car() output in the original post that illustrates empty cells that are the effect of Trial containing information from two distinct variables.

    > mixed.anova2 <- aov_car(Measure ~ Group*Trial*StimType + Error(1|SubjectID/(Trial*StimType)), prepared_data)
    Error: Empty cells in within-subjects design  (i.e., bad data structure).
    table(data[c("Trial", "StimType")])
    #               StimType
    # Trial          A B
    #   trial1_stimA 6 0
    #   trial1_stimB 0 6
    #   trial2_stimA 6 0
    #   trial2_stimB 0 6
    #   trial3_stimA 6 0
    #   trial3_stimB 0 6
    > 
    > aov.bww
    

    We can correct the rank deficiency by splitting the Trial variable into two variables.

    With the lme4 package and tidyr::separate() to separate trial from stim value, the analysis looks like this:

    textFile <- "rowId SubjectID  Group        Trial StimType    Measure
    1          1 group1 trial3_stimA        A 0.55908866
    2          2 group1 trial3_stimA        A 0.98884446
    3          3 group2 trial3_stimA        A 0.00000000
    4          4 group2 trial3_stimA        A 0.27067991
    5          5 group3 trial3_stimA        A 0.37169285
    6          6 group3 trial3_stimA        A 0.42113984
    7          1 group1 trial3_stimB        B 0.00000000
    8          2 group1 trial3_stimB        B 0.49892807
    9          3 group2 trial3_stimB        B 0.14602589
    10         4 group2 trial3_stimB        B 0.50946555
    11         5 group3 trial3_stimB        B 0.25572820
    12         6 group3 trial3_stimB        B 0.22932966
    13         1 group1 trial1_stimA        A 0.42207604
    14         2 group1 trial1_stimA        A 0.85599588
    15         3 group2 trial1_stimA        A 0.36428381
    16         4 group2 trial1_stimA        A 0.46679336
    17         5 group3 trial1_stimA        A 0.69379734
    18         6 group3 trial1_stimA        A 0.55607716
    19         1 group1 trial1_stimB        B 0.24261465
    20         2 group1 trial1_stimB        B 0.35176384
    21         3 group2 trial1_stimB        B 0.21116215
    22         4 group2 trial1_stimB        B 0.33112544
    23         5 group3 trial1_stimB        B 0.00000000
    24         6 group3 trial1_stimB        B 0.00000000
    25         1 group1 trial2_stimA        A 0.05506943
    26         2 group1 trial2_stimA        A 0.22537470
    27         3 group2 trial2_stimA        A 0.00000000
    28         4 group2 trial2_stimA        A 0.18511144
    29         5 group3 trial2_stimA        A 0.15586156
    30         6 group3 trial2_stimA        A 0.04467100
    31         1 group1 trial2_stimB        B 0.03890585
    32         2 group1 trial2_stimB        B 0.29787709
    33         3 group2 trial2_stimB        B 0.00000000
    34         4 group2 trial2_stimB        B 0.28971992
    35         5 group3 trial2_stimB        B 0.12993238
    36         6 group3 trial2_stimB        B 0.05066011
    "
    data <- read.table(text = textFile,header = TRUE)
    
    library(dplyr)
    library(tidyr)
    # split trial from Stim
    data %>% separate(Trial,into = c("trialName","stimName")) -> data
    library(lme4)
    model = lmer(Measure ~Group*trialName*StimType +(1|SubjectID), data = data)
    
    summary(model)
    

    ...and the output:

    > summary(model)
    Linear mixed model fit by REML ['lmerMod']
    Formula: Measure ~ Group * trialName * StimType + (1 | SubjectID)
       Data: data
    
    REML criterion at convergence: -16.8
    
    Scaled residuals: 
        Min      1Q  Median      3Q     Max 
    -1.1854 -0.5424  0.0000  0.5424  1.1854 
    
    Random effects:
     Groups    Name        Variance Std.Dev.
     SubjectID (Intercept) 0.024226 0.15565 
     Residual              0.006861 0.08283 
    Number of obs: 36, groups:  SubjectID, 6
    
    Fixed effects:
                                          Estimate Std. Error t value
    (Intercept)                            0.63904    0.12467   5.126
    Groupgroup2                           -0.22350    0.17631  -1.268
    Groupgroup3                           -0.01410    0.17631  -0.080
    trialNametrial2                       -0.49881    0.08283  -6.022
    trialNametrial3                        0.13493    0.08283   1.629
    StimTypeB                             -0.34185    0.08283  -4.127
    Groupgroup2:trialNametrial2            0.17583    0.11714   1.501
    Groupgroup3:trialNametrial2           -0.02586    0.11714  -0.221
    Groupgroup2:trialNametrial3           -0.41513    0.11714  -3.544
    Groupgroup3:trialNametrial3           -0.36345    0.11714  -3.103
    Groupgroup2:StimTypeB                  0.19745    0.11714   1.686
    Groupgroup3:StimTypeB                 -0.28309    0.11714  -2.417
    trialNametrial2:StimTypeB              0.37002    0.11714   3.159
    trialNametrial3:StimTypeB             -0.18266    0.11714  -1.559
    Groupgroup2:trialNametrial2:StimTypeB -0.17332    0.16566  -1.046
    Groupgroup3:trialNametrial2:StimTypeB  0.24495    0.16566   1.479
    Groupgroup2:trialNametrial3:StimTypeB  0.51946    0.16566   3.136
    Groupgroup3:trialNametrial3:StimTypeB  0.65371    0.16566   3.946
    
    Correlation matrix not shown by default, as p = 18 > 12.
    Use print(x, correlation=TRUE)  or
        vcov(x)        if you need it
    
    > 
    

    UPDATE: Parsing effect names

    In the comments to my answer, the questioner explained that the data used to describe trials and stim types were different than what was represented in the data posted with the question. Given the contents of the comment, here is an approach with dplyr to parse out the trial conditions and stim types.

    # solving the data cleaning problem in comments
    
    df <- data.frame(treatmentName = c("MeanCondPlusLate",
                       "MeanCondMinusLate", 
                       "MeanExtPlusLate", 
                       "MeanExtMinusLate", 
                       "TestPlus1",
                       "TestMinus1"))
    
    library(dplyr)
    df %>% mutate(trial = case_when(grepl("Cond",treatmentName) == TRUE ~ 1,
                                    grepl("Ext",treatmentName) == TRUE ~ 2,
                                    grepl("Test",treatmentName) == TRUE ~ 3), 
                  stimLevel = case_when(grepl("Plus",treatmentName) == TRUE ~ "A",
                                        grepl("Minus",treatmentName) == TRUE ~ "B"))
    

    ...and the output:

          treatmentName trial stimLevel
    1  MeanCondPlusLate     1         A
    2 MeanCondMinusLate     1         B
    3   MeanExtPlusLate     2         A
    4  MeanExtMinusLate     2         B
    5         TestPlus1     3         A
    6        TestMinus1     3         B
    >