I conducted a survey with LimeSurvey and exported the results as a CSV file, which I import into R.
One of the questions is a multiple-choice question in which participants could name the subjects they study. The output from LimeSurvey looks roughly like this (but with more subjects and more participants):
Participant | Maths | Physics | English | Biology
1 | Y | | Y |
2 | | Y | Y |
3 | Y | Y | | Y
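For anyone who wants to reproduce this, here is a minimal sketch of how the table above might look once imported. It assumes a comma-separated export in which an unselected subject is an empty field; the real LimeSurvey export may use different column names or separators, and `csv_text` and `survey` are names made up for this example.

```r
# Hypothetical CSV content mirroring the example table (not a real export).
csv_text <- "Participant,Maths,Physics,English,Biology
1,Y,,Y,
2,,Y,Y,
3,Y,Y,,Y"

# na.strings = "" turns the empty fields into NA instead of empty strings.
survey <- read.csv(text = csv_text, na.strings = "", stringsAsFactors = FALSE)

print(survey)
```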
I'd like to get a result that looks like this:
Participant | Subject 1 | Subject 2 | Subject 3
1 | Maths | English |
2 | Physics | English |
3 | Maths | Physics | Biology
I'd be grateful for any pointers.
Here's my attempt to generate the expected dataframe as requested:
library(tidyverse) # for dplyr::recode
library(gtools)    # for mixedsort

# Each cell is either 'Y' (subject selected) or NA (not selected)
rand_list = c('Y', NA)
df = data.frame(participant = seq(1, 10, by = 1), # participant IDs 1 to 10
                Maths = sample(rand_list, 10, replace = TRUE),
                Physics = sample(rand_list, 10, replace = TRUE),
                English = sample(rand_list, 10, replace = TRUE),
                Biology = sample(rand_list, 10, replace = TRUE))

# Replace every 'Y' with the name of its column and rename the
# subject columns to 'Subject 1', 'Subject 2', ...
df_to_new_format = function(data){
  vector_subject = colnames(data)
  vector_new_col = c()
  # Build the new column names; the first column holds the participant ID
  for (i in seq_along(vector_subject)){
    if (i == 1){
      new_col = 'participant'
    } else{
      new_col = paste('Subject', as.character(i - 1))
    }
    vector_new_col <- c(vector_new_col, new_col)
  }
  # Recode 'Y' to the subject name in every column except the first
  for (j in seq_along(vector_subject)){
    if (j == 1){
      next
    } else{
      data[[j]] <- recode(data[[j]], 'Y' = vector_subject[j])
    }
  }
  colnames(data) <- vector_new_col
  return(data)
}

df = df_to_new_format(data = df)

# Sort the values within each row and stack the sorted rows
df_new_format = c()
for (m in 1:nrow(df)){
  temp = mixedsort(as.matrix(df[m,]))
  print(temp) # show the sorted row while debugging
  df_new_format = rbind(df_new_format, temp)
}
df_new_format = as.data.frame(df_new_format, row.names = FALSE)
colnames(df_new_format) = colnames(df)
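
For comparison, here is a shorter base-R sketch of the same reshaping, run on the three-participant example rather than the random data. It keeps the subjects in their original column order instead of sorting them, and only creates as many "Subject" columns as the busiest participant needs. It assumes unselected subjects are NA; `survey`, `subject_cols`, `chosen`, `padded` and `wide` are names made up for this example, and this is one possible approach rather than a definitive one.

```r
# Example input from the question; blank cells are assumed to be NA.
survey <- data.frame(
  Participant = 1:3,
  Maths   = c("Y", NA, "Y"),
  Physics = c(NA, "Y", "Y"),
  English = c("Y", "Y", NA),
  Biology = c(NA, NA, "Y"),
  stringsAsFactors = FALSE
)

subject_cols <- setdiff(names(survey), "Participant")

# For each row, collect the names of the columns marked "Y", in column order.
chosen <- lapply(seq_len(nrow(survey)), function(i) {
  subject_cols[which(survey[i, subject_cols] == "Y")]
})

# Pad each participant's subjects with NA up to the longest list,
# then stack them into a matrix with one row per participant.
max_n <- max(lengths(chosen))
padded <- do.call(rbind, lapply(chosen, function(s) {
  c(s, rep(NA_character_, max_n - length(s)))
}))

wide <- data.frame(Participant = survey$Participant, padded,
                   stringsAsFactors = FALSE)
names(wide)[-1] <- paste("Subject", seq_len(max_n))

print(wide)
```

Because the padding is driven by `max_n`, the number of output columns depends on the data, so a survey where nobody picks more than two subjects would yield only "Subject 1" and "Subject 2".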