Non-mutually exclusive race categories into mutually exclusive categories in R using dplyr

I was interested in recoding non-mutually exclusive race categories into mutually exclusive race categories.

This post was helpful using base R, but I was wondering if there was a tidyverse way to do it: "Recoding non-mutually exclusive variables into mutually exclusive variables".

This post was almost helpful, but not fully applicable: "Combining different dummy variables into a single categorical variable based on conditions (mutually exclusive categories)?"

Here is an example of how the data is set up:

'1' means the participant checked off this answer. '0' means the participant did NOT check off the answer.

I wanted to make a new variable called 'RaceCount' that adds up the 1's in each row. If the count is ABOVE 1, the row should get a new value, 'Multiracial'. If the count is exactly 1, the value should just match the one category the participant checked. Is there any way to do this using the tidyverse?

I tried something like this, but I keep getting really odd counts:

df <- df %>% 
  group_by(White, Asian, Black) %>% mutate(RaceCount = n())

I am envisioning something like this:

ID White Black Asian RaceCount
1   1      0      0      White
2   1      0      0      White
3   1      0      0      White
4   1      0      1      Multiracial
5   1      0      1      Multiracial
6   0      0      1      Asian
7   0      1      0      Black
8   0      1      0      Black
9   1      0      0      White
10  0      1      0      Black

Solution

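Here is one potential tidyverse approach using case_when(). The conditions are checked in order, so the "Multiracial" test has to come first; a row with no boxes checked matches none of the conditions and gets NA: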

library(tidyverse)

df <- structure(list(ID = 1:10, White = c(1L, 1L, 1L, 1L, 1L, 0L, 0L, 0L, 1L, 0L),
                     Black = c(0L, 0L, 0L, 0L, 0L, 0L, 1L, 1L, 0L, 1L),
                     Asian = c(0L, 0L, 0L, 1L, 1L, 1L, 0L, 0L, 0L, 0L)), 
                class = "data.frame", row.names = c(NA, -10L))
df %>%
  mutate(RaceCount = case_when(White + Asian + Black > 1 ~ "Multiracial",
                               White == 1 ~ "White",
                               Black == 1 ~ "Black",
                               Asian == 1 ~ "Asian"))
#>    ID White Black Asian   RaceCount
#> 1   1     1     0     0       White
#> 2   2     1     0     0       White
#> 3   3     1     0     0       White
#> 4   4     1     0     1 Multiracial
#> 5   5     1     0     1 Multiracial
#> 6   6     0     0     1       Asian
#> 7   7     0     1     0       Black
#> 8   8     0     1     0       Black
#> 9   9     1     0     0       White
#> 10 10     0     1     0       Black

Created on 2024-12-19 with reprex v2.1.0
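
If there are more race columns than the three in the example, writing one case_when() condition per column gets tedious. Below is a minimal sketch of a variant that counts the checked boxes per row, as the question describes, and then looks up the single checked column by position. The names race_cols, n_checked and result are made up for this illustration, and it assumes every race column holds only 0/1 values.

```r
library(dplyr)

# Same example data as above
df <- data.frame(
  ID    = 1:10,
  White = c(1L, 1L, 1L, 1L, 1L, 0L, 0L, 0L, 1L, 0L),
  Black = c(0L, 0L, 0L, 0L, 0L, 0L, 1L, 1L, 0L, 1L),
  Asian = c(0L, 0L, 0L, 1L, 1L, 1L, 0L, 0L, 0L, 0L)
)

# Hypothetical helper: the columns that hold the 0/1 race indicators
race_cols <- c("White", "Black", "Asian")

result <- df %>%
  mutate(
    # number of boxes checked in each row
    n_checked = rowSums(across(all_of(race_cols))),
    RaceCount = case_when(
      n_checked > 1  ~ "Multiracial",
      # exactly one box checked: take the name of the column holding the 1
      n_checked == 1 ~ race_cols[max.col(across(all_of(race_cols)),
                                         ties.method = "first")],
      # no box checked: leave as missing
      TRUE ~ NA_character_
    )
  )

print(result)
```

For the example data, RaceCount comes out the same as in the answer above. The n_checked column is kept only so the counts can be inspected; it can be dropped with select(-n_checked).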