I am having trouble understanding how I can use dcast (or any other function) to restructure my data frame.
I was given a data frame that looks something like this:
Patient | DOB | Gender | variable | value |
---|---|---|---|---|
1234 | 2-12-19 | F | Age | 25 |
1235 | 2-13-19 | M | Age | 25 |
1236 | 2-14-19 | F | BMI | 25 |
1237 | 2-15-19 | M | Age | 25 |
1238 | 2-16-19 | F | Height | 55 |
1239 | 2-17-19 | F | Age | 25 |
I want to produce a data frame where each of the variables in the variable column gets its own column, filled with its respective values.
I don't understand how dcast can be used when a single column holds several different variables.
I want my final data frame to look something like this:
Patient | DOB | Gender | Age | BMI | Height |
---|---|---|---|---|---|
1234 | 2-12-19 | F | 25 | 25 | 55 |
1235 | 2-13-19 | M | 25 | 14 | 34 |
1236 | 2-14-19 | F | 25 | 30 | 20 |
1237 | 2-15-19 | M | 25 | 45 | 25 |
1238 | 2-16-19 | F | 55 | 25 | 13 |
1239 | 2-17-19 | F | 25 | 56 | 40 |
You can use pivot_wider() from tidyr to spread the entries of the variable column into separate columns:
library(tidyr)
df %>%
  pivot_wider(names_from = variable, values_from = value)
# A tibble: 6 x 6
  Patient DOB     Gender   Age   BMI Height
    <int> <chr>   <chr>  <int> <int>  <int>
1    1234 2-12-19 F         25    NA     NA
2    1235 2-13-19 M         25    NA     NA
3    1236 2-14-19 F         NA    25     NA
4    1237 2-15-19 M         25    NA     NA
5    1238 2-16-19 F         NA    NA     55
6    1239 2-17-19 F         25    NA     NA
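Since the question asks about dcast specifically, here is a rough equivalent using reshape2. This is only a sketch: the df below is my reconstruction of the sample table from the question (the column types are an assumption), and the name wide is just a placeholder. In the formula, the columns left of ~ identify each output row, and the column right of ~ supplies the new column names.

```r
library(reshape2)

# Sample data rebuilt from the table in the question (types are assumed)
df <- data.frame(
  Patient  = c(1234L, 1235L, 1236L, 1237L, 1238L, 1239L),
  DOB      = c("2-12-19", "2-13-19", "2-14-19", "2-15-19", "2-16-19", "2-17-19"),
  Gender   = c("F", "M", "F", "M", "F", "F"),
  variable = c("Age", "Age", "BMI", "Age", "Height", "Age"),
  value    = c(25L, 25L, 25L, 25L, 55L, 25L),
  stringsAsFactors = FALSE
)

# Patient, DOB and Gender stay as identifier columns;
# each distinct entry of `variable` becomes a column filled from `value`
wide <- dcast(df, Patient + DOB + Gender ~ variable, value.var = "value")
print(wide)
```

With either approach, the NA cells appear because each patient in the sample has only one row, so only one of Age, BMI, or Height is known per patient. If the real data has one row per patient per measurement, those cells should be filled in.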