I am trying to perform a MANOVA on a tidy dataframe that looks somewhat like the following. "id" refers to the participant number. The independent variables are "init_cont" (with values of I or K) and "family" (with values of C, S, or D), making for a 2x3 design. The column "qnumber" refers to the number of the question participants answer, with each participant answering 3 questions. "value" is each participant's response to a particular question.
id init_cont family qnumber value
1 I C 1 3.5
1 I C 2 2
1 I C 3 4
2 K C 1 2
2 K C 2 5
2 K C 3 3
3 K S 1 4.5
3 K S 2 5
3 K S 3 3
4 K D 1 1
4 K D 2 7.5
4 K D 3 3
What is the best way for me to perform a MANOVA on this data? I am interested in the interactions between the independent variables and how they impact the "value" for each of the 3 questions. In case it is relevant, my actual dataset has 14 different questions.
I have considered reorganizing the data in the following format, but I am unsure how to do this in R. The numbers after "value" in each new column are from "qnumber".
id init_cont family value1 value2 value3
1 I C 3.5 2 4
2 K C 2 5 3
3 K S 4.5 5 3
4 K D 1 7.5 3
tidyr::spread (the function lives in tidyr, not dplyr) does the first part of your problem easily, using the df defined in the reproducible data at the end.
library(dplyr)  # provides the %>% pipe
library(tidyr)  # provides spread()
df %>% spread(qnumber, value)
# id init_cont family 1 2 3
# 1 1 I C 3.5 2.0 4
# 2 2 K C 2.0 5.0 3
# 3 3 K S 4.5 5.0 3
# 4 4 K D 1.0 7.5 3
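If you want the columns named value1, value2, value3 as in your target layout, a minimal sketch with tidyr::pivot_wider (which supersedes spread from tidyr 1.0.0 onward) could look like the following. The object names long and wide are just illustrative.

```r
library(tidyr)

# Same long-format data as in the question
long <- read.table(text = "id init_cont family qnumber value
1 I C 1 3.5
1 I C 2 2
1 I C 3 4
2 K C 1 2
2 K C 2 5
2 K C 3 3
3 K S 1 4.5
3 K S 2 5
3 K S 3 3
4 K D 1 1
4 K D 2 7.5
4 K D 3 3", header = TRUE)

# One row per participant; each qnumber becomes a column prefixed with "value"
wide <- pivot_wider(long,
                    names_from   = qnumber,
                    values_from  = value,
                    names_prefix = "value")
wide
```

The prefix also avoids column names that start with a digit, which are awkward to use in a model formula.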
Here is the reproducible data.
t <- 'id init_cont family qnumber value
1 I C 1 3.5
1 I C 2 2
1 I C 3 4
2 K C 1 2
2 K C 2 5
2 K C 3 3
3 K S 1 4.5
3 K S 2 5
3 K S 3 3
4 K D 1 1
4 K D 2 7.5
4 K D 3 3'
df <- read.table(text = t, header = TRUE)
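For the MANOVA itself, here is a rough sketch once the data are in wide format. The four example rows are too few to fit a 2x3 model with three responses, so this uses simulated data (random numbers, purely an assumption for illustration) with the same column names. The names wide, resp_cols, and Y are hypothetical.

```r
set.seed(1)

# Simulated wide data: 10 participants per cell of the 2x3 design (assumption)
wide <- expand.grid(init_cont = c("I", "K"),
                    family    = c("C", "S", "D"),
                    rep       = 1:10)
wide$value1 <- rnorm(nrow(wide))
wide$value2 <- rnorm(nrow(wide))
wide$value3 <- rnorm(nrow(wide))

# Collect every response column by name, so the same code works for 14 questions
resp_cols <- grep("^value", names(wide), value = TRUE)
Y <- as.matrix(wide[, resp_cols])

# Main effects plus the init_cont x family interaction
fit <- manova(Y ~ init_cont * family, data = wide)

summary(fit)      # multivariate tests (Pillai's trace by default)
summary.aov(fit)  # follow-up univariate ANOVA for each question
```

With your real data, replace the simulated wide with the reshaped data frame and make sure init_cont and family are factors.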