I was interested in recoding non-mutually exclusive race categories into mutually exclusive race categories.
This post was helpful using base R, but I was wondering if there was a tidyverse way to do it: "Recoding non-mutually exclusive variables into mutually exclusive variables".
This post was almost helpful, but not fully applicable: "Combining different dummy variables into a single categorical variable based on conditions (mutually exclusive categories)?"
Here is an example of how the data is set up:
'1' means the participant checked off this answer. '0' means the participant did NOT check off the answer.
I wanted to make a new variable called 'RaceCount' that adds up the 1's in each row. If the count is ABOVE 1, the participant should be recoded with a new value called 'Multiracial'. If the count is exactly 1, the value should just match the race the participant checked off. Is there any way to do this using the tidyverse?
I tried something like this, but I keep getting really odd counts:
df <- df %>%
  group_by(White, Asian, Black) %>%
  mutate(RaceCount = n())
I am envisioning something like this:
ID White Black Asian RaceCount
1 1 0 0 White
2 1 0 0 White
3 1 0 0 White
4 1 0 1 Multiracial
5 1 0 1 Multiracial
6 0 0 1 Asian
7 0 1 0 Black
8 0 1 0 Black
9 1 0 0 White
10 0 1 0 Black
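Your group_by() attempt gives odd counts because n() returns the number of rows that share each combination of White, Asian and Black, not the number of boxes each participant checked. Here is one potential tidyverse approach: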
library(tidyverse)
df <- structure(list(ID = 1:10, White = c(1L, 1L, 1L, 1L, 1L, 0L, 0L, 0L, 1L, 0L),
Black = c(0L, 0L, 0L, 0L, 0L, 0L, 1L, 1L, 0L, 1L),
Asian = c(0L, 0L, 0L, 1L, 1L, 1L, 0L, 0L, 0L, 0L)),
class = "data.frame", row.names = c(NA, -10L))
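# case_when() uses the first condition that matches, so test for more than
# one checked race before the single-race conditions.
# Rows with no race checked match no condition and get NA.
df %>%
  mutate(RaceCount = case_when(White + Asian + Black > 1 ~ "Multiracial",
                               White == 1 ~ "White",
                               Black == 1 ~ "Black",
                               Asian == 1 ~ "Asian"))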
#> ID White Black Asian RaceCount
#> 1 1 1 0 0 White
#> 2 2 1 0 0 White
#> 3 3 1 0 0 White
#> 4 4 1 0 1 Multiracial
#> 5 5 1 0 1 Multiracial
#> 6 6 0 0 1 Asian
#> 7 7 0 1 0 Black
#> 8 8 0 1 0 Black
#> 9 9 1 0 0 White
#> 10 10 0 1 0 Black
Created on 2024-12-19 with reprex v2.1.0
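If you have more race columns than the three shown here, writing one case_when() line per column gets tedious. Below is a rough sketch of one way to generalise it, not the only way. The names race_cols, n_checked and result are made up for illustration, and it assumes every race column only contains 0 or 1. It uses rowSums(across(...)) to count the checked boxes per row and max.col() to look up which single column was checked.

```r
library(dplyr)

# Same example data as above
df <- data.frame(
  ID    = 1:10,
  White = c(1L, 1L, 1L, 1L, 1L, 0L, 0L, 0L, 1L, 0L),
  Black = c(0L, 0L, 0L, 0L, 0L, 0L, 1L, 1L, 0L, 1L),
  Asian = c(0L, 0L, 0L, 1L, 1L, 1L, 0L, 0L, 0L, 0L)
)

# Hypothetical helper vector: the names of all 0/1 race columns
race_cols <- c("White", "Black", "Asian")

result <- df %>%
  mutate(
    # Number of boxes each participant checked
    n_checked = rowSums(across(all_of(race_cols))),
    RaceCount = case_when(
      n_checked > 1  ~ "Multiracial",
      # Exactly one box checked: max.col() gives the position of that column
      n_checked == 1 ~ race_cols[max.col(as.matrix(across(all_of(race_cols))),
                                         ties.method = "first")],
      # No box checked (or missing data): leave as NA
      TRUE ~ NA_character_
    )
  )

result
```

With this layout, adding another race only means adding its column name to race_cols. The n_checked column is kept so you can check the counts; drop it with select(-n_checked) if you don't need it.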