I've been given some data that I've combined into long form, but I need to get it into a certain format for a deliverable. I've tinkered with dataframe and list options and cannot seem to find a way to get the data I have into the output form I need. Any thoughts and solutions are appreciated.
If the desired output form seems odd for R, it is because other people will open the resulting data in Excel for additional study. So I will save the final data as a csv or Excel file. The full data in the desired form will have 40 rows (+header) and 110 columns (55 student and score pairs).
Here is an example of my long-form data:
class | student | score |
---|---|---|
1 | a | 0.4977 |
1 | b | 0.7176 |
1 | c | 0.9919 |
1 | d | 0.3800 |
1 | e | 0.7774 |
2 | f | 0.9347 |
2 | g | 0.2121 |
2 | h | 0.6517 |
2 | i | 0.1256 |
2 | j | 0.2672 |
3 | k | 0.3861 |
3 | l | 0.0134 |
3 | m | 0.3824 |
3 | n | 0.8697 |
3 | o | 0.3403 |
Here is an example of how I need the final data to appear:
class_1_student | class_1_score | class_2_student | class_2_score | class_3_student | class_3_score |
---|---|---|---|---|---|
a | 0.4977 | f | 0.9347 | k | 0.3861 |
b | 0.7176 | g | 0.2121 | l | 0.0134 |
c | 0.9919 | h | 0.6517 | m | 0.3824 |
d | 0.3800 | i | 0.1256 | n | 0.8697 |
e | 0.7774 | j | 0.2672 | o | 0.3403 |
Here is R code to generate the sample long form and desired form data:
set.seed(1)
d <- data.frame(
  class = c(rep(1, 5), rep(2, 5), rep(3, 5)),
  student = c(letters[1:5], letters[6:10], letters[11:15]),
  score = round(runif(15, 0, 1), 4)
)
d2 <- data.frame(
  class_1_student = d[1:5, 2],
  class_1_score = d[1:5, 3],
  class_2_student = d[6:10, 2],
  class_2_score = d[6:10, 3],
  class_3_student = d[11:15, 2],
  class_3_score = d[11:15, 3]
)
If it's helpful, I also have the student and score data in separate matrices (1 row per student and 1 column per class) that I could use to help generate the final data.
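You can split the data by class, rename the columns of each piece, and bind the pieces side by side:
library(tidyverse)
# Split the student/score columns into one data frame per class.
split(select(d, -class), d$class) %>%
  # Prefix each piece's column names with "class_<id>_" (.y is the list name).
  imap(~ setNames(.x, str_c("class", .y, names(.x), sep = "_"))) %>%
  # Bind the renamed pieces column-wise.
  bind_cols()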
Note that column binding works only if every class has the same number of students.
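If your classes turn out not to be the same size, one possible workaround is to pad the shorter groups with NA rows before binding. The following is only a sketch in base R, not something tested against your full data: the sample data frame is invented for illustration (class 3 deliberately has three students), and the names `pieces`, `n_max`, `padded`, and `wide` are my own.

```r
# Invented sample data with unequal group sizes (class 3 has only 3 students).
d <- data.frame(
  class   = c(rep(1, 5), rep(2, 5), rep(3, 3)),
  student = letters[1:13],
  score   = round(seq(0.1, 0.9, length.out = 13), 4)
)

# Split into one data frame per class, keeping only student and score.
pieces <- split(d[, c("student", "score")], d$class)

# Row count of the largest class; smaller classes are padded up to this.
n_max <- max(sapply(pieces, nrow))

# Pad each piece with NA rows and prefix its column names with "class_<id>_".
padded <- Map(function(piece, id) {
  # Indexing a data frame past its last row yields rows filled with NA.
  piece <- piece[seq_len(n_max), , drop = FALSE]
  rownames(piece) <- NULL
  names(piece) <- paste("class", id, names(piece), sep = "_")
  piece
}, pieces, names(pieces))

# unname() keeps cbind() from prepending the list names to the column names.
wide <- do.call(cbind, unname(padded))
```

Since the result is going to Excel, something like `write.csv(wide, "scores_wide.csv", row.names = FALSE, na = "")` (the file name is just a placeholder) should write the padded cells as blanks rather than the text `NA`.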