r matrix dummy-data

Create a binary indicator matrix (Boolean matrix) in R


I have a list of data indicating attendance at conferences, like this:

Event                     Participant  
ConferenceA               John   
ConferenceA               Joe  
ConferenceA               Mary    
ConferenceB               John  
ConferenceB               Ted  
ConferenceC               Jessica  

I would like to create a binary indicator attendance matrix in the following format:

Event        John  Joe  Mary  Ted  Jessica  
ConferenceA  1     1    1     0    0  
ConferenceB  1     0    0     1    0  
ConferenceC  0     0    0     0    1  

Is there a way to do this in R?


Solution

  • Assuming your data.frame is called "mydf", simply use table:

    > table(mydf)
                 Participant
    Event         Jessica Joe John Mary Ted
      ConferenceA       0   1    1    1   0
      ConferenceB       0   0    1    0   1
      ConferenceC       1   0    0    0   0
    

    If someone could be listed more than once for the same conference, table will return counts greater than 1 for those cells. In that case, recode every value greater than 1 to 1, like this:

    temp <- table(mydf)
    temp[temp > 1] <- 1
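
    As a minimal sketch of two alternatives to the in-place recode, you could drop duplicate rows before tabulating, or threshold the counts. The data frame "dup_df" below is hypothetical and made up only to show a duplicated attendance row; it is not part of the question's data.

```r
# Hypothetical data: John is listed twice for ConferenceA.
dup_df <- data.frame(
  Event = c("ConferenceA", "ConferenceA", "ConferenceA", "ConferenceB"),
  Participant = c("John", "John", "Joe", "Ted"),
  stringsAsFactors = FALSE
)

# Raw counts: the (ConferenceA, John) cell is 2 here.
counts <- table(dup_df)

# Option 1: remove duplicate rows first, so every count is 0 or 1.
ind_unique <- table(unique(dup_df))

# Option 2: threshold the counts; multiplying the logical result by 1L
# converts TRUE/FALSE to integer 1/0.
ind_thresh <- (counts > 0) * 1L
```

    Both options should give the same 0/1 result. Option 1 leaves the original counts untouched, which may help if you also need them later.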
    

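    Note that this returns a table, not a data.frame. To get a data.frame in the same wide layout, use as.data.frame.matrix (plain as.data.frame on a table would return a long format with a Freq column instead):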

    > as.data.frame.matrix(table(mydf))
                Jessica Joe John Mary Ted
    ConferenceA       0   1    1    1   0
    ConferenceB       0   0    1    0   1
    ConferenceC       1   0    0    0   0
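
    table sorts the participants alphabetically, whereas the layout in the question lists them in order of first appearance. A minimal sketch of one way to keep that order is to turn Participant into a factor whose levels follow unique(). The data.frame() call below is assumed to be an equivalent reconstruction of "mydf", and the names "tab", "m" and "out" are only illustrative.

```r
# Assumed equivalent reconstruction of the example data.
mydf <- data.frame(
  Event = c("ConferenceA", "ConferenceA", "ConferenceA",
            "ConferenceB", "ConferenceB", "ConferenceC"),
  Participant = c("John", "Joe", "Mary", "John", "Ted", "Jessica"),
  stringsAsFactors = FALSE
)

# Fix the factor levels to first-appearance order so table() does not
# fall back to alphabetical ordering of the columns.
mydf$Participant <- factor(mydf$Participant,
                           levels = unique(mydf$Participant))

tab <- table(mydf)

# Plain integer matrix (drops the "table" class, keeps the dimnames).
m <- unclass(tab)

# Wide data.frame with one column per participant.
out <- as.data.frame.matrix(tab)
```

    Here "out" has its columns in the order John, Joe, Mary, Ted, Jessica, and "m" is an ordinary matrix if that is what downstream code expects.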
    

    In the above, "mydf" is defined as:

    mydf <- structure(list(Event = c("ConferenceA", "ConferenceA", 
      "ConferenceA", "ConferenceB", "ConferenceB", "ConferenceC"), 
      Participant = c("John", "Joe", "Mary", "John", "Ted", "Jessica")), 
      .Names = c("Event", "Participant"), class = "data.frame", 
      row.names = c(NA, -6L))
    

    Please share your data in a similar, reproducible form in the future (for example, by pasting the output of dput(mydf)).