r dataframe membership

Check whether values in one data frame column exist in a second data frame


I have two data frames (A and B), both with a column 'C'. I want to check whether the values in column 'C' of data frame A exist in column 'C' of data frame B.

A = data.frame(C = c(1,2,3,4))
B = data.frame(C = c(1,3,4,7))

Solution

  • Use %in% as follows:

    A$C %in% B$C
    

    This tells you which values of column C in A are also in column C of B.

    The result is a logical vector with one element per value of A$C. For the example in your question, you get:

    A$C %in% B$C
    # [1]  TRUE FALSE  TRUE  TRUE
    

    You can use this logical vector as an index to the rows of A, or as an index to A$C to get the actual values:

    # as a row index
    A[A$C %in% B$C,  ]  # note the comma to indicate we are indexing rows
    
    # as an index to A$C: returns all values of A$C that are in B$C
    A$C[A$C %in% B$C]
    # [1] 1 3 4
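
    One caveat about the row-index form, sketched here under the assumption that A has a single column as in the question: with only one column, A[rows, ] drops to a plain vector by default. Passing drop = FALSE keeps the result a data frame. The names keep and A_in_B are illustrative and not part of the original answer.

    ```r
    A <- data.frame(C = c(1, 2, 3, 4))
    B <- data.frame(C = c(1, 3, 4, 7))

    keep <- A$C %in% B$C                 # TRUE FALSE TRUE TRUE

    # A has only one column, so the default drop behaviour returns a vector
    is.data.frame(A[keep, ])             # FALSE

    # drop = FALSE keeps the data frame structure
    A_in_B <- A[keep, , drop = FALSE]
    is.data.frame(A_in_B)                # TRUE
    A_in_B$C                             # 1 3 4
    ```

    With two or more columns, A[keep, ] already returns a data frame, so drop = FALSE only matters in the single-column case.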
    

    We can negate it too:

    A$C[!A$C %in% B$C]  # returns all values of A$C that are NOT in B$C
    # [1] 2
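
    The negation works without parentheses because ! has lower precedence than %in%, so !A$C %in% B$C is evaluated as !(A$C %in% B$C). A minimal sketch that makes the grouping explicit and compares it with base R's setdiff(), which is an alternative and not what the answer above uses (not_in_B is an illustrative name):

    ```r
    A <- data.frame(C = c(1, 2, 3, 4))
    B <- data.frame(C = c(1, 3, 4, 7))

    # Explicit parentheses: same result as !A$C %in% B$C
    not_in_B <- !(A$C %in% B$C)
    identical(not_in_B, !A$C %in% B$C)   # TRUE

    A$C[not_in_B]                        # 2

    # setdiff() gives the values directly, but drops duplicates
    setdiff(A$C, B$C)                    # 2
    ```

    The two approaches differ when A$C contains repeated values: logical indexing keeps every occurrence, while setdiff() returns each value once.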
    



    If you want to know whether a specific value is in B$C, use the same operator:

      2 %in% B$C   # "is the value 2 in B$C ?"
      # [1] FALSE

      A$C[2] %in% B$C  # "is the 2nd element of A$C in B$C ?"
      # [1] FALSE
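
    If you only need a single yes/no answer for the whole column rather than one result per value, the logical vector can be summarised with base R functions. A short sketch using the data from the question (in_B is an illustrative name):

    ```r
    A <- data.frame(C = c(1, 2, 3, 4))
    B <- data.frame(C = c(1, 3, 4, 7))

    in_B <- A$C %in% B$C   # one logical value per element of A$C

    any(in_B)     # TRUE:  at least one value of A$C is in B$C
    all(in_B)     # FALSE: the value 2 is not in B$C
    sum(in_B)     # 3:     number of values of A$C found in B$C
    which(in_B)   # 1 3 4: positions in A$C that have a match
    ```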