excelexcel-formulaexcel-2010vba

Extracting digits from a cell with varying char length


I have a group of cells, the first of the string never changes, it is and always will (until the coder changes it) 20 characters (inc spaces).

I then want to extract the 3 numbers (and in some cases 2) from the remaining sequence.

The monthly cost is 2 silver, 1 copper and 40 iron.
The monthly cost is 1 silver, 94 copper and 40 iron.
The monthly cost is 1 silver and 75 copper.
The monthly cost is 8 silver and 40 copper.
The monthly cost is 1 silver.
The monthly cost is 99 silver, 99 copper and 99 iron.
The monthly cost is 1 gold.

In the sample above you can see that there is no set value after the first 20 chars.

1 or 99 silver 
1 or 99 copper
0, 1 or 99 iron  

I can't get a sequence that gets all the cells correct, I've tried the following:

=IF(J7<>1,(MID(TRIM(J7),FIND(" iron",TRIM(J7))-2,FIND(" iron",TRIM(J7))-FIND(" iron",TRIM(J7))+3)),"")    
results in:  #VALUE!  (when no iron)  

=TRIM(MID(J6,FIND(" silver",J6)-2,LEN(J6)-FIND(" silver",J6)-26))&TRIM(MID(J6,FIND(" copper",J6)-2,LEN(J6)-FIND(" copper",J6)-16))&TRIM(MID(J6,FIND(" iron",J6)-2,LEN(J6)-FIND(" iron",J6)-3))  
results in:  1 s9440   

=MID(J7,31,2-ISERR(MID(J7,21,1)+0))  
results in:  nd

If I & the cells as part of the calculation, they then don't calculate in the next mathematical step as I've had to allow for spaces in my code, in the case that there may be 2 digit numbers, not single.

=MID(J5,SEARCH(" silver",J5,1)-2,2)&MID(J5,SEARCH(" copper",J5,1)-2,2)&MID(J5,SEARCH(" iron",J5,1)-2,2)  
results:   2 140
not:       2140

What I need to end up with is:

2140  
19440  
175  
840  
1  
999999   

Many thanks in advance.


Solution

  • When it comes to pattern matching in strings, RegEx if often the way to go.

    In Excel, this requires a VBA solution, using a reference to "Microsoft VBScript Regular Expresions 5.5" (you can go late bound if you prefer)

    Here's a starter for your case, as a UDF

    Use it as a formula like =GetValues(A1) assuming 1st raw data is in A1. Copy down for as many rows as required

    This will extract up to 3 values from a string.

    Function GetValues(r As Range) As Variant
        Dim re As RegExp
        Dim m As MatchCollection
        Dim v As Variant
        Dim i As Long
        Set re = New RegExp
    
        re.Pattern = "(\d+)\D+(\d+)\D+(\d+)"
        If re.test(r.Value) Then
            Set m = re.Execute(r.Value)
        Else
            re.Pattern = "(\d+)\D+(\d+)"
            If re.test(r.Value) Then
                Set m = re.Execute(r.Value)
            Else
                re.Pattern = "(\d+)"
                If re.test(r.Value) Then
                    Set m = re.Execute(r.Value)
                End If
            End If
        End If
        If m Is Nothing Then
            GetValues = vbNullString
        Else
            For i = 0 To m.Item(0).SubMatches.Count - 1
                v = v & m.Item(0).SubMatches(i)
            Next
            GetValues = v
        End If
    End Function