excelvba

Excel VBA - Comparing 2 Comma-Separated Number Strings - Appending Non-Overlapping Numbers From 2nd String


I've been reviewing multiple questions and have not come across my exact issue. Fairly new to VBA, but generally have been able to muddle through if I know where to start. So, I'm not looking for anyone to hand me a solution (although any assistance is appreciated). Here's what I'm trying to do. I have two comma-separated strings (String 1 and String 2) of generally 5 to 10 numbers. String 2 is generally a continuation of "some" portion of String 1. As an example let's assume String 1 is 4,15,30,22,60,54,21 and String 2 is 22,60,54,21,2,10,6. So, in my example 22,60,54,21 overlaps a portion of String 1. What I would like to end up with is a combined string of 4,15,30,22,60,54,21,2,10,6 such that the overlapping portion of the number string is not duplicated. The overlap could be variable, meaning on occasion it could be 1-2 numbers and another could be 4-6 numbers.

I have tried to accomplish this by using a bunch of helper columns or rows with some success, but I would want to employ a solution over more than two cell sets. In my research it "seems" as though the correct area would be StrComp and/or InStr. Again, thanks to anyone willing to give a bit of direction.


Solution

  • Here is a different way of achieving your goal, that I think more closely resembles how a human might do it:

    Sub Remove_overlapping_duplicates()
    
     Dim str1 As String, str2 As String, str3 As String, strlen as long, n As Long
     str1 = "4,15,30,22,60,54,21"
     str2 = "22,60,54,21,2,10,6"
     str3 = str1 & "," & str2
     
     'get shortest string
     strlen = Len(str1)
     If Len(str2) < strlen Then strlen = Len(str2)
     
     'loop through
     For n = strlen To 1 Step -1
        If Left(str2, n) = Right(str1, n) Then
           str3 = str1 & Mid(str2, n + 1)
           Exit For
        End If
     Next
     
     'output answer
     MsgBox str3
     
    End Sub
    

    It starts with the shortest of the two string lengths, and starts attempting to match the entirety of str2 against str1. It then repeats, shortening the length each time until a match is found. The first match found with the most characters wins through and the output is produced. If no match is found at all, str1 & str2 are output.

    You can see it working here:

    4,15,30,22,60,54, | ,15,30,22,60,54,21 | 
                      | 22,60,54,21,2,10,6 | 
    
    4,15,30,22,60,54 | 15,30,22,60,54,21 | 
                     | 22,60,54,21,2,10, | 6
    
    4,15,30,22,60,5 | 5,30,22,60,54,21 | 
                    | 22,60,54,21,2,10 | ,6
    
    4,15,30,22,60, | ,30,22,60,54,21 | 
                   | 22,60,54,21,2,1 | 0,6
    
    4,15,30,22,60 | 30,22,60,54,21 | 
                  | 22,60,54,21,2, | 10,6
    
    4,15,30,22,6 | 0,22,60,54,21 | 
                 | 22,60,54,21,2 | ,10,6
    
    4,15,30,22, | ,22,60,54,21 | 
                | 22,60,54,21, | 2,10,6
    
    4,15,30,22 | 22,60,54,21 | 
               | 22,60,54,21 | ,2,10,6    << match found