rdplyrreplacemergecsplit

Merge columns with multiple delimiters


how can I merge columns with variable number of delimiters so I can obtain something like the output (assuming everything is a character)?

 dt1
   letter
1       a
2     b+c
3       c
4 d+e+f+g
5   a+g+e

 dt2
  letter number
1      a      1
2      b      2
3      c      3
4      d      4
5      e      5
6      f      6
7      g      7

> output
   letter  number
1       a       1
2     b+c     2+3
3       c       3
4 d+e+f+g 4+5+6+7
5   a+g+e   1+7+5
dt1<-data.frame(letter=c("a","b+c","c","d+e+f+g","a+g+e"))

dt2<-data.frame(letter=c("a","b","c","d","e","f","g"),number=c("1","2","3","4","5","6","7"))

output<-data.frame(letter=c("a","b+c","c","d+e+f+g","a+g+e"), number=c("1","2+3","3","4+5+6+7","1+7+5"))

Solution

  • A base R solution can be,

    dt1$res <- sapply(strsplit(dt1$letter, '+', fixed = TRUE), function(i)paste(dt2$number[dt2$letter %in% i], collapse = '+'))
    
    #   letter     res
    #1       a       1
    #2     b+c     2+3
    #3       c       3
    #4 d+e+f+g 4+5+6+7
    #5   a+g+e   1+5+7