rdataframerecode

How to make all responses in a column into their own unique column in R


I currently have a dataset in R that is in long format and I'm trying to make it wide with a couple of specifications. So my dataset has a respondent ID and their gender along with one other column (let's say "fruits") that I'm interested in.

id <- c(1,1,1,2,3,4,4,5,5,5)
gender <- c("F","F","F","M","Unknown","F","F","M","M","M")
fruit <- c("pear", "apple", "banana", "pear", "strawberry","banana", "banana", "pear", "strawberry", "banana")
df <- cbind(id, gender, fruit)

      id  gender    fruit       
 [1,] "1" "F"       "pear"      
 [2,] "1" "F"       "apple"     
 [3,] "1" "F"       "banana"    
 [4,] "2" "M"       "pear"      
 [5,] "3" "Unknown" "strawberry"
 [6,] "4" "F"       "banana"    
 [7,] "4" "F"       "banana"    
 [8,] "5" "M"       "pear"      
 [9,] "5" "M"       "strawberry"
[10,] "5" "M"       "banana" 

My goal is to create a binary column for each potential response in that column to determine whether that individual provided that response at all throughout the dataset.

id <- c(1,2,3,4,5)
gender <- c("F","M","Unknown","F","M")
pear <- c(1,1,0,0,1)
apple <- c(1,0,0,0,0)
banana <- c(1,0,0,1,1)
strawberry <- c(0,0,1,0,1)
df2 <- cbind(id, gender, pear, apple, banana, strawberry)

     id  gender    pear apple banana strawberry
[1,] "1" "F"       "1"  "1"   "1"    "0"       
[2,] "2" "M"       "1"  "0"   "0"    "0"       
[3,] "3" "Unknown" "0"  "0"   "0"    "1"       
[4,] "4" "F"       "0"  "0"   "1"    "0"       
[5,] "5" "M"       "1"  "0"   "1"    "1"  

I'm not sure if this is important to note but while in it's original long format, a specific ID can have multiple rows with the same response to "fruits" like I have shown with my mock respondent 4. I hope that's clear; thanks very much in advance!


Solution

  • Here is a pure Tidyverse approach:

    library(tidyverse)
    
    df |>
      distinct() |>
      pivot_wider(
        names_from = fruit,
        values_from = fruit,
        values_fn = \(x) 1,
        values_fill = 0
      )
    #> # A tibble: 5 × 6
    #>   id    gender   pear apple banana strawberry
    #>   <chr> <chr>   <dbl> <dbl>  <dbl>      <dbl>
    #> 1 1     F           1     1      1          0
    #> 2 2     M           1     0      0          0
    #> 3 3     Unknown     0     0      0          1
    #> 4 4     F           0     0      1          0
    #> 5 5     M           1     0      1          1
    

    Created on 2023-04-12 with reprex v2.0.2