regexvbams-access

Extracting a numeric value from the middle of a String field in MS Access 2016


Our department uses a Microsoft Access database to manage roughly 300,000 records. The front-end of the system allows users to add comments to a donor's record. As you might guess, the comment field became the wild west with no standards for how comments were input. To make matters worse, prior to me starting here, in an instance where a donor already had a comment assigned to their record, rather than creating a new comment (a new line in the back-end table), the staff would open the existing comment and add additional notes to the end. ONLY DELIMITED BY A SPACE!

One of the instances where a comment would be added is if a duplicate record was found. The duplicate would be disabled with a note listing the original records ID for reference.

The time has come for us to migrate to a new system and I would love to pull these ID numbers out of the comment field so they can be added to a new table and used to link the original and duplicate records together before we import unnecessary data into a new database.

The problem I'm running into is; how do I write a query to extract the referenced ID value from the middle of a comments field that is a text field when there was no ? All IDs are 7 digits long. Below are a couple examples of how the roughly 4,000 instances of these comments were input.

In searching forums, I was able to build a public function that I could reference in a query but it pulls in all numeric values in the string, not just those that are in a string of 7 consecutive digits?

Public Function ExtractDonorID(strInput) As String
    Dim strResult As String, strCh As String
    Dim z As Integer
    If Not IsNull(strInput) Then
        For z = 1 To Len(strInput)
            strCh = Mid(strInput, z, 1)
            Select Case strCh
                Case "0" To "9"
                    strResult = strResult & strCh
                Case Else
            End Select
        Next z
    End If
    ExtractDonorID= strResult
End Function

Any Ideas?


Solution

  • Something like this maybe, but a Regexp might be better:

    Sub Tester()
        Dim v
        For Each v In Array("dup see 1189827, Rita is Mrs. White", _
                            "dup see 11898278, Rita is Mrs. White", _
                            "4-16-18 d DUPE - 1344126", _
                            "4-16-18 d DUPE - 134412")
            Debug.Print v, vbLf, "Found: " & ExtractNumericId(v)
        Next v
    End Sub
    
    
    Public Function ExtractNumericId(ByVal strInput) As String
        Const NUM_DIGITS As Long = 7 'this many and no more....
        Dim txt As String, patt As String
        Dim z As Long, l As Long
        
        If Not IsNull(strInput) Then
            l = Len(strInput)
            patt = String(NUM_DIGITS, "#") 'checking for x digits...
            strInput = strInput & " "      'kludge in case digits are at the end of the input
            z = 1
            Do While z <= l - (NUM_DIGITS - 1)
                txt = Mid(strInput, z, NUM_DIGITS)
                If txt Like patt Then
                    'check next character is not also a digit...
                    If Not Mid(strInput, z + NUM_DIGITS, 1) Like "#" Then
                        ExtractNumericId = txt
                        Exit Function
                    Else
                        z = z + NUM_DIGITS 'skip this section...
                    End If
                End If
                z = z + 1
            Loop
        End If
    End Function
    

    The original accepted code had a bug - this is more robust:

    Public Function ExtractNumericId2(ByVal strInput) As String
        Const NUM_DIGITS As Long = 7 'this many and no more....
        Dim txt As String
        Dim z As Long, l As Long, ch As String
        
        strInput = strInput & " " 'kludge in case id is at the end of the input
        For z = 1 To Len(strInput)
            ch = Mid(strInput, z, 1)
            If ch Like "#" Then 'is a digit?
                txt = txt & ch  'append to potential id
            Else
                If Len(txt) = NUM_DIGITS Then 'just ended a NUM_DIGITS run?
                    ExtractNumericId2 = txt    'return the id
                    Exit Function
                Else
                    txt = ""   'reset potential id
                End If
            End If
        Next z
    End Function