r tidyverse multiple-columns dcast

How to create multiple columns from one column, maybe using dcast or tidyverse


I am learning R and trying to figure out how to split a column. I want to spread my data from a single column into wide format. I was told to use dcast, but I haven't figured out the best way and was going to try piping it through the tidyverse.

# sample data
data <- data.frame(trimesterPeriod = c("first", "second", "third", "PP", "third", "second", "PP", "first"))
# dataframe 
  trimesterPeriod 
1 first
2 second
3 third
4 PP
5 third
6 second
7 PP
8 first

and I would like it to look like this:

#dataframe
ID     first       second       third       PP
1        1            0           0         0
2        0            1           0         0 
3        0            0           1         0
4        0            0           0         1 
5        0            0           1         0 
6        0            1           0         0 
7        0            0           0         1
8        1            0           0         0 

I know that I will have to change the trimesterPeriod data from a character, but from there I'm not sure where to go. I was thinking of doing:

data.frame %>%
    mutate(rn = row_number(first, second, third, PP)) %>%
    spread(trimesterPeriod) %>%
    select(-rn)

but I'm not sure. Any suggestions are greatly appreciated!


Solution

  • We could use table from base R

    table(seq_len(nrow(data)), data$trimesterPeriod)
    

    -output

        first PP second third
      1     1  0      0     0
      2     0  0      1     0
      3     0  0      0     1
      4     0  1      0     0
      5     0  0      0     1
      6     0  0      1     0
      7     0  1      0     0
      8     1  0      0     0
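
    The table output above lists the columns alphabetically and is not a data frame. If the column order from the question (first, second, third, PP) and an ID column are wanted, one possible sketch is to convert trimesterPeriod to a factor with explicit levels first. The name `wide` is only illustrative, and the level order is an assumption taken from the desired output in the question.

    ```r
    # Sample data, repeated here so the block runs on its own
    data <- data.frame(trimesterPeriod = c("first", "second", "third", "PP",
                                           "third", "second", "PP", "first"))

    # Explicit factor levels control the column order of the table
    data$trimesterPeriod <- factor(data$trimesterPeriod,
                                   levels = c("first", "second", "third", "PP"))

    # Cross-tabulate row index against period, then convert to a data frame
    wide <- as.data.frame.matrix(table(seq_len(nrow(data)), data$trimesterPeriod))

    # Add the row index as an explicit ID column in front
    wide <- cbind(ID = seq_len(nrow(wide)), wide)
    wide
    ```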
    

    Or using tidyverse

    library(dplyr)
    library(tidyr)
    data %>%
      mutate(ID = row_number()) %>%
      pivot_wider(names_from = trimesterPeriod,
                  values_from = trimesterPeriod,
                  values_fn = length,
                  values_fill = 0)
    

    -output

    # A tibble: 8 × 5
         ID first second third    PP
      <int> <int>  <int> <int> <int>
    1     1     1      0     0     0
    2     2     0      1     0     0
    3     3     0      0     1     0
    4     4     0      0     0     1
    5     5     0      0     1     0
    6     6     0      1     0     0
    7     7     0      0     0     1
    8     8     1      0     0     0
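
    Since the question mentions dcast, here is a rough equivalent using dcast from the data.table package (assuming that package is installed). The names `dt` and `wide` are only illustrative. Counting with `length` gives 1 where the period occurs in that row and 0 elsewhere; the factor conversion is an assumption made to keep the columns in the order shown in the question.

    ```r
    library(data.table)

    # Sample data as a data.table, repeated here so the block runs on its own
    dt <- data.table(trimesterPeriod = c("first", "second", "third", "PP",
                                         "third", "second", "PP", "first"))

    # Factor levels determine the order of the new columns
    dt[, trimesterPeriod := factor(trimesterPeriod,
                                   levels = c("first", "second", "third", "PP"))]

    # One ID per original row, so each row stays separate in the wide result
    dt[, ID := .I]

    # Cast to wide format, counting occurrences of each period per ID
    wide <- dcast(dt, ID ~ trimesterPeriod,
                  value.var = "trimesterPeriod",
                  fun.aggregate = length)
    wide
    ```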
    

    data

    data <- structure(list(trimesterPeriod = c("first", "second", "third", 
    "PP", "third", "second", "PP", "first")),
     class = "data.frame", row.names = c("1", 
    "2", "3", "4", "5", "6", "7", "8"))