r string unique

How to find unique strings in R when forward and reversed order count as the same


I have a character vector like this:

list <- c('a_b', 'a_c', 'a_d', 'a_e', 'a_b', 'b_a', 'b_c', 'b_c','c_b')

I want the unique values, with reversed pairs treated as duplicates, so that 'b_a' and 'c_b' are dropped (they repeat 'a_b' and 'b_c'). I have tried unique(), but it does not remove 'b_a' and 'c_b'. Any help would be appreciated. Many thanks!


Solution

  • You could use strsplit() to split each string at the underscore, sort the two parts alphabetically, and paste them back together. That turns "b_a" into "a_b", so a pair and its reverse become the same string. Calling unique() on the sorted strings then gives the result.

    l <- c('a_b', 'a_c', 'a_d', 'a_e', 'a_b', 'b_a', 'b_c', 'b_c','c_b')
    
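    # Split each string at the underscore into its parts
    ll <- strsplit(l, "_")
    # Sort each pair and rejoin it, so "b_a" becomes "a_b"
    ll <- sapply(ll, \(x) paste(sort(x), collapse = "_"))
    # Keep one copy of each normalized string
    unique(ll)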
    #> [1] "a_b" "a_c" "a_d" "a_e" "b_c"
    

    Created on 2025-02-05 with reprex v2.1.1
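
If you would rather keep the strings as they first appear in the input instead of their sorted form, one possible variation is to use the sorted strings only as a comparison key and filter the original vector with duplicated(). This is a minimal sketch, not part of the answer above: the helper name `unique_unordered` and its `sep` argument are made up for illustration.

```r
# Hypothetical helper (name and `sep` argument are assumptions for illustration):
# keeps the first occurrence of each pair, treating "a_b" and "b_a" as equal.
unique_unordered <- function(x, sep = "_") {
  # Split each string at the separator (fixed = TRUE: literal match, not a regex)
  parts <- strsplit(x, sep, fixed = TRUE)
  # Build an order-independent key by sorting the parts and rejoining them
  key <- vapply(parts, function(p) paste(sort(p), collapse = sep), character(1))
  # Keep the original strings whose key has not been seen before
  x[!duplicated(key)]
}

l <- c('a_b', 'a_c', 'a_d', 'a_e', 'a_b', 'b_a', 'b_c', 'b_c', 'c_b')
unique_unordered(l)
#> [1] "a_b" "a_c" "a_d" "a_e" "b_c"
```

For this input the result matches the approach above. The two differ only when the reversed form comes first: unique_unordered(c('b_a', 'a_b')) returns "b_a", whereas sorting and then calling unique() returns "a_b".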