Tags: r, dummy-variable, reformatting

Reconstruct a categorical variable from dummies in R


Heyho, I am a beginner in R and have a problem I haven't found a solution to so far. I would like to transform dummy variables back into a single categorical variable, i.e. turn this:

|dummy1| dummy2|dummy3|
|------| ------|------|
| 0    | 1     |0     |
| 1    | 0     |0     |
| 0    | 1     |0     |
| 0    | 0     |1     |

into:

|dummy |
|------|
|dummy2|
|dummy1|
|dummy2|
|dummy3|

Do you have any idea how to do that in R? Thanks in advance.


Solution

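  • You can do this with data.table by melting the dummy columns into long format:

    library(data.table)

    # dt is your data.table; x1 and x2 stand in for its identifier columns
    id_cols <- c("x1", "x2")
    melt(data = dt, id.vars = id_cols,
         na.rm = TRUE,
         measure.vars = patterns("dummy"))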
    

    Example:

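    library(data.table)

    # named dt rather than t, so base::t() is not masked
    dt <- data.table(dummy_a = c(1, 0, 0), dummy_b = c(0, 1, 0), dummy_c = c(0, 0, 1), id = c(1, 2, 3))
    # reshape to long format, then keep only the rows where the dummy is set
    melt(data = dt,
         id.vars = "id",
         measure.vars = patterns("dummy_"),
         na.rm = TRUE)[value == 1, .(id, variable)]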
    

    Output

       id variable
    1:  1  dummy_a
    2:  2  dummy_b
    3:  3  dummy_c
    

    It's even easier if you replace the 0s with NA: na.rm = TRUE in melt then drops every row with NA, so the value == 1 filter is no longer needed.
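
A minimal sketch of that NA variant, applied to the table from the question. The `id` column is an assumption added here for illustration (the question's data has no identifier, and melt needs one to keep track of the rows), and `dummy_cols`, `long` and `result` are just illustrative names:

```r
library(data.table)

# The question's table, plus a hypothetical row id (not in the original data)
dt <- data.table(
  id     = 1:4,
  dummy1 = c(0, 1, 0, 0),
  dummy2 = c(1, 0, 1, 0),
  dummy3 = c(0, 0, 0, 1)
)

dummy_cols <- grep("^dummy", names(dt), value = TRUE)

# Replace every 0 in the dummy columns with NA (modifies dt by reference)
for (col in dummy_cols) {
  set(dt, i = which(dt[[col]] == 0), j = col, value = NA_real_)
}

# With the 0s gone, na.rm = TRUE leaves exactly one row per id
long <- melt(dt, id.vars = "id", measure.vars = dummy_cols,
             variable.name = "dummy", na.rm = TRUE,
             variable.factor = FALSE)

# melt returns rows grouped by dummy column, so restore the original row order
setorder(long, id)
result <- long[, .(id, dummy)]
print(result)
```

The `dummy` column of `result` should then read dummy2, dummy1, dummy2, dummy3 for ids 1 to 4, which is the table asked for in the question. This assumes every row has exactly one dummy set to 1; a row with several 1s would show up several times, and a row with none would be dropped.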