r statistics survey qualtrics vignette

Qualtrics survey, reducing columns


Some fellow students and I created a Qualtrics survey for the course Judicial Lawmaking. We worked with 4 case vignettes. Each respondent first answered some general questions and then answered one case. They were first asked whether alimony should be granted and, in a second question, how much. Only those who answered yes saw the second question.

We have now imported the data into R. Since each respondent answered only 1 case and the other 3 were left open, there are a lot of missing values. I am trying to create a dataset without all the unanswered questions. However, I only manage to keep the yes answers. Alternatively, I managed to remove the NAs, but then the first question no longer seems to be linked to the second one: if Q7 was answered yes, the next column should be Q8, but I see that the first column says Q7 and the second says Q12, for example.

I will add the code I wrote, but I am a law student, so my understanding of all this is rather limited. Below is a simplified example. The numbers 1 to 4 represent the 4 different cases.

    # pivot_longer(), drop_na() and %>% all come from tidyr
    library(tidyr)

    age <- c("18-30","18-30","31-45", 60)
    YesNo1 <- c("Yes", NA,NA,NA)
    Height1 <- c(250,NA,NA,NA)
    YesNo2 <- c(NA,"NO",NA,NA)
    Height2 <- c(NA,NA,NA,NA)
    YesNo3 <- c(NA,NA,"Yes", NA)
    Height3 <- c(NA,NA,320,NA)
    YesNo4 <- c(NA,NA,NA,"yes")
    Height4 <- c(NA,NA,NA, 290)

    Test <- data.frame(age, YesNo1, Height1, YesNo2, Height2, 
                       YesNo3, Height3, YesNo4,Height4)

    # inspect the data
    Test

    # reduce the columns
    mi <- pivot_longer(Test, c(YesNo1, YesNo2, YesNo3, YesNo4), 
                       names_to = "decision", values_to = "yes/no")

    mi1 <- pivot_longer(mi, c(Height1, Height2, Height3, Height4), 
                        names_to = "alimony", values_to = "height")

    # drop the NA rows
    mi2 <- mi1 %>% drop_na('yes/no')

Ideally, I would like one dataset with the general questions, followed by a column with the number of the yes/no question and a column with its answer, and then a column with the number of the question on how much alimony should be granted and a column with its answer. The question numbers should always match (7 and 8, 9 and 10, ...). I hope this is clear and that someone can help me.

I translated my problem into the simplified version above. When you run it in R, you can see that each Yes and each No shows up 4 times, but I only want to keep each one once. I can't simply delete the remaining rows containing NA, since that would also delete the question that was answered No. Do you have any idea how I can fix this?


Solution

  • Apparently you want to use tidyr. I am not fluent in the tidyverse, so I'd like to show you an approach using base R and the stack function. Taking your example data:

    age <- c("18-30","18-30","31-45", 60)
    YesNo1 <- c("Yes", NA,NA,NA)
    Height1 <- c(250,NA,NA,NA)
    YesNo2 <- c(NA,"NO",NA,NA)
    Height2 <- c(NA,NA,NA,NA)
    YesNo3 <- c(NA,NA,"Yes", NA)
    Height3 <- c(NA,NA,320,NA)
    YesNo4 <- c(NA,NA,NA,"yes")
    Height4 <- c(NA,NA,NA, 290)
    
    Test <- data.frame(age, YesNo1, Height1, YesNo2, Height2, 
                       YesNo3, Height3, YesNo4,Height4)
    

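    We can now stack the YesNo columns and the Height columns on top of each other, calling the result stacked. Since age has only 4 values, data.frame() recycles it across the 16 stacked rows, which keeps each respondent's age aligned with their answers: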

    stacked <- data.frame(age = Test$age,
                   yesno = stack(Test, select = c("YesNo1", "YesNo2", "YesNo3", "YesNo4")),
                   height = stack(Test, select = c("Height1", "Height2", "Height3", "Height4"))
                    )
    

    If you print(stacked) you'll see a lot of NA. So in the next (and final) step, we delete all rows that have an NA in the yesno.values column:

    stacked <- stacked[!is.na(stacked$yesno.values),]
    print(stacked)
    

    And the result is what I understood from your question to be the goal:

    > print(stacked)
         age yesno.values yesno.ind height.values height.ind
    1  18-30          Yes    YesNo1           250    Height1
    6  18-30           NO    YesNo2            NA    Height2
    11 31-45          Yes    YesNo3           320    Height3
    16    60          yes    YesNo4           290    Height4
    

    Sorry for this not being a tidyverse answer. At least, the No answer was kept in the data.
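
For readers who would still prefer tidyr, the same reshaping can probably be done in a single pivot_longer() call. This is only a sketch under assumptions, not the approach of the answer above: it assumes every case column is named as a word followed by one digit (YesNo1, Height1, ...), and the column name case and the objects long and answered are made up for illustration. The special name ".value" tells pivot_longer() to keep YesNo and Height as separate columns, so the yes/no answer and the amount stay paired by case number.

```r
library(tidyr)

# Example data from the question
age     <- c("18-30", "18-30", "31-45", 60)
YesNo1  <- c("Yes", NA, NA, NA)
Height1 <- c(250, NA, NA, NA)
YesNo2  <- c(NA, "NO", NA, NA)
Height2 <- c(NA, NA, NA, NA)
YesNo3  <- c(NA, NA, "Yes", NA)
Height3 <- c(NA, NA, 320, NA)
YesNo4  <- c(NA, NA, NA, "yes")
Height4 <- c(NA, NA, NA, 290)

Test <- data.frame(age, YesNo1, Height1, YesNo2, Height2,
                   YesNo3, Height3, YesNo4, Height4)

# Split each column name into its stem (YesNo / Height) and its case number.
# ".value" turns the stem into an output column; "case" (a hypothetical name)
# receives the digit, so both answers for one case end up in the same row.
long <- pivot_longer(
  Test,
  cols          = -age,
  names_to      = c(".value", "case"),
  names_pattern = "([A-Za-z]+)(\\d)"
)

# Keep only the case each respondent actually answered.
# Filtering on YesNo alone keeps the "NO" row even though its Height is NA.
answered <- drop_na(long, YesNo)
print(answered)
```

With the example data this should leave one row per respondent, including the respondent who answered No. In the real Qualtrics export the columns are named Q7, Q8 and so on, so names_pattern would not apply as written and the columns would first need renaming to a stem-plus-number scheme.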