[SOLVED] Convert dummy variables to single categorical in R?

Similar questions have been asked here, here, and here. However, they don't seem to cover exactly what I need. For example, if I have a dataset like so:

df <- data.frame(
  x = rnorm(10),
  y = rnorm(10),
  a = c(0,0,0,1,1,0,0,0,1,0),
  b = c(1,1,1,1,0,0,1,0,0,0),
  c = c(0,1,0,1,0,0,0,0,0,0),
  z = c(1,1,1,1,1,0,1,0,1,0)
)

What I'm trying to do is convert the variables a, b, and c to a single categorical variable whose levels are a, b, and c. But as you can see, sometimes two or more of the variables are set in the same row. So, what I'm trying to achieve is a data frame that would look something like this:

df <- data.frame(
  x = rnorm(10),
  y = rnorm(10),
  a = c(0,0,0,1,1,0,0,0,1,0),
  b = c(1,1,1,1,0,0,1,0,0,0),
  c = c(0,1,0,1,0,0,0,0,0,0),
  z = c("b","b,c","b","a,b,c","a",0,"b",0,"a",0)
)

I tried using:

apply(df[,c("a","b", "c")], 1, sum, na.rm=TRUE)

which counts how many of the dummy variables are set in each row, but I'm not sure how to combine two (or more) variables into a single factor level.

Any suggestions as to how I could do this?

Solution

Loop over the selected columns by row (MARGIN = 1), subset the column names where the value is 1, and paste them together:

# For each row, keep the names of the columns equal to 1 and join them with ", "
df$z <- apply(df[c('a', 'b', 'c')], 1, function(x) toString(names(x)[x == 1]))
df$z
#[1] "b"       "b, c"    "b"       "a, b, c" "a"       ""        "b"       ""        "a"       ""
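
To see what the anonymous function does, it may help to run the same steps on a single row by hand. The sketch below rebuilds the second row of the example data as a named vector (the name row2 is only for illustration), which is roughly what apply() passes to the function for each row:

```r
# Second row of the example data: a = 0, b = 1, c = 1
row2 <- c(a = 0, b = 1, c = 1)

# Logical mask marking which dummies are set in this row
is_set <- row2 == 1

# Names of the columns that are set
set_names <- names(row2)[is_set]

# toString() joins them with ", "
label <- toString(set_names)

# A row with no dummy set yields an empty string
row6 <- c(a = 0, b = 0, c = 0)
empty_label <- toString(names(row6)[row6 == 1])
```

Here label is "b, c" and empty_label is "", which matches the second and sixth entries of the output above.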

To replace the empty strings "" with '0':

df$z[df$z == ''] <- '0'
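
The question asks for a categorical variable with labels like "b,c" (no space after the comma), whereas toString() always separates with ", ". As a minimal sketch, assuming that exact label format is wanted, paste() with collapse = "," could be used instead and the result wrapped in factor(). The helper name collapse_dummies is hypothetical and not part of the original answer; x and y are left out of the example data for brevity.

```r
# Example dummy columns from the question (x and y omitted)
df <- data.frame(
  a = c(0,0,0,1,1,0,0,0,1,0),
  b = c(1,1,1,1,0,0,1,0,0,0),
  c = c(0,1,0,1,0,0,0,0,0,0)
)

# Hypothetical helper: join the names of the columns equal to 1 in one row
collapse_dummies <- function(row) paste(names(row)[row == 1], collapse = ",")

z <- apply(df[c("a", "b", "c")], 1, collapse_dummies)
z[z == ""] <- "0"   # rows where no dummy is set
df$z <- factor(z)   # store as a single categorical variable

levels(df$z)        # one level per distinct combination
```

Each distinct combination ("a", "b", "b,c", "a,b,c", and "0") becomes its own factor level, so the number of levels grows with the number of combinations that actually occur in the data.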

For a solution with purrr and dplyr:

library(dplyr); library(purrr)  # provide %>%, mutate(), select(), and pmap_chr()
df %>% mutate(z = pmap_chr(select(., a, b, c), ~ {v1 <- c(...); toString(names(v1)[v1 == 1])}))