r dplyr rowwise

Checking whether sets of columns are the same, row-wise in R, in any order


I am working in R, and would prefer a dplyr solution if possible.

Sample data:

df <- data.frame(
  col1 = c("a", "b", "c", "d"),
  col2 = c("a", "b", "d", "a"),
  col3 = rep("a", 4L),
  col4 = c("a", "b", "d", "a"),
  col5 = c("a", "a", "c", "d"),
  col6 = rep(c("b", "a"), each = 2L)
)
col1 col2 col3 col4 col5 col6
a    a    a    a    a    b
b    b    a    b    a    b
c    d    a    d    c    a
d    a    a    a    d    a

Question

For each row, I would like to know whether the values in col1, col2 and col3 are the same as the values in col4, col5 and col6, ignoring the order within each group.

For example, if col1 - col3 of a row contained a, a, b respectively, and col4 - col6 contained b, a, a respectively, that would be considered a match.
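To make the matching rule concrete, here is a minimal base R sketch of an order-insensitive comparison of two vectors. The helper name `same_multiset` is hypothetical, and the sketch assumes that duplicates count (so a, a, b does not match a, b, b) and that there are no missing values.

```r
# Hypothetical helper: TRUE when x and y hold the same values the same
# number of times, regardless of order. Assumes no NA values.
same_multiset <- function(x, y) {
  length(x) == length(y) && identical(sort(x), sort(y))
}

same_multiset(c("a", "a", "b"), c("b", "a", "a"))  # TRUE: same values, different order
same_multiset(c("a", "a", "a"), c("a", "a", "b"))  # FALSE: the second group contains a "b"

# setequal() ignores duplicates, so it is a looser test than the one above
setequal(c("a", "a", "b"), c("a", "b", "b"))       # TRUE
same_multiset(c("a", "a", "b"), c("a", "b", "b"))  # FALSE
```

If duplicates should not matter, `setequal()` may be the more appropriate test.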

Desired result

I have added a note to the "assessment" column to aid understanding.

col1 col2 col3 col4 col5 col6 assessment
a    a    a    a    a    b    FALSE (because 1-3 are not the same as 4-6)
b    b    a    b    a    b    TRUE (because 1-3 are the same as 4-6, ignoring order)
c    d    a    d    c    a    TRUE (because 1-3 are the same as 4-6, ignoring order)
d    a    a    a    d    a    TRUE (because 1-3 are the same as 4-6, ignoring order)

Solution

  • Using dplyr, you can sort each group of columns within a row and compare the sorted values element by element:

    library(dplyr)

    # Sort both groups within each row so that order is ignored, then compare
    df %>%
      rowwise() %>%
      mutate(result = all(sort(c_across(col1:col3)) == sort(c_across(col4:col6))))
    
    # A tibble: 4 × 7
    # Rowwise: 
      col1  col2  col3  col4  col5  col6  result
      <chr> <chr> <chr> <chr> <chr> <chr> <lgl> 
    1 a     a     a     a     a     b     FALSE 
    2 b     b     a     b     a     b     TRUE  
    3 c     d     a     d     c     a     TRUE  
    4 d     a     a     a     d     a     TRUE
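For comparison, the same check can be sketched in base R without `rowwise()`. This is an illustrative alternative rather than part of the answer above. The names `first_cols` and `second_cols` are hypothetical, and the sketch assumes the columns are character vectors (the default for `data.frame()` in R 4.0 and later) with no missing values.

```r
# Sample data from the question
df <- data.frame(
  col1 = c("a", "b", "c", "d"),
  col2 = c("a", "b", "d", "a"),
  col3 = rep("a", 4L),
  col4 = c("a", "b", "d", "a"),
  col5 = c("a", "a", "c", "d"),
  col6 = rep(c("b", "a"), each = 2L)
)

# Hypothetical names for the two groups of columns being compared
first_cols  <- c("col1", "col2", "col3")
second_cols <- c("col4", "col5", "col6")

# For each row, pull both groups out as plain character vectors,
# sort them, and test whether the sorted vectors are identical
df$result <- vapply(seq_len(nrow(df)), function(i) {
  identical(
    sort(unlist(df[i, first_cols], use.names = FALSE)),
    sort(unlist(df[i, second_cols], use.names = FALSE))
  )
}, logical(1))

df$result
# [1] FALSE  TRUE  TRUE  TRUE
```

Because `identical()` is used instead of `all(... == ...)`, groups of different lengths simply return FALSE rather than triggering recycling.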