excelerror-handlingvba7

Why does VBA RegExp Remove Thousands of '/', then hang on cell with '/' - Run-time error '1004?


I'm new to VBA but I've done a bit of scripting in the past. More, I'm fairly confident with regular expressions, so this one is throwing me.

I'm attempting to import a subset of information form one .xlsx file into another spreadsheet. Some of the data is 'polluted' with unwanted characters, and I'm going to need to extract substrings from some cells, so I enabled 'Microsoft VBScript Regular Expressions 5.5' in the VBA editor.

I wrote a loop to go through the imported data and clean out some of the unwanted characters:

'remove special characters on the temporary sheet
Set SrchSht = SdsFile.Sheets("New IO")      'set sheet to search
Set SrchStrt = SrchSht.Cells(1, 1)          'start of search
Set SrchEnd = SrchSht.Cells(RowLim, ColLim) 'end of search
Set SrchRange = SrchSht.Range(SrchStrt.Address & ":" & SrchEnd.Address)       'full search range
'RegExSrchPat = "[^\.\w\(\) ,-:]+"      'match one or more characters not [^] in the brackets
RegExSrchPat = "[¶/]+"      'match one or more pillcrows or forward slashes
RegExRepPat = "" 
For Each C In SrchRange
    If RegExSrchPat <> "" Then
        SrchString = CStr(C.Value)
        With RegEx
            .Global = True
            .MultiLine = True
            .IgnoreCase = True
            .Pattern = RegExSrchPat
        End With
        If RegEx.Test(SrchString) Then
            C.Value = RegEx.Replace(SrchString, RegExRepPat) 
        End If
    End If
Next

(RowLim and ColLim are set earlier in the script and have values of 3899 and 12, respectively, in case you're wondering.)

You'll note that RegExSrchPat is set twice and commented out once. In the first (commented) instance I attempted to define the characters I wanted to keep and match/remove those that didn't fit the pattern. The exclusive pattern seemed to work well until I noticed it wasn't removing any forward slashes. All of the other characters I wanted gone were good and gone. (I had to add a few to the pattern that I was mistakenly deleting.)

I eventually concluded that all I needed to remove were pilcrows and forward slashes, so maybe a positive matching pattern would do the trick. Thus, pattern number two.

And it worked, mostly. For 2435 rows it matched and removed the forward slashes that the other pattern left behind. Got the pilcrows too. But then it hung on row 2436 in column A with a cell value of '=+MA-30509PL:1/PE', giving the message "Run-time error '1004: Application-defined or object-defined error".

Now I'm already using CStr() to evaluate the cell contents as a string, so I thought changing the format of the cell in question from General to Text wouldn't do much … and I was wrong. Doing that allows the script to resume, remove the forward slash in A2436 and complete it's appointed task like it's Tuesday. But I can't expect the person using this script to figure that out, so this isn't a solution. It's just a (confusing) data point.

Am I missing something in the exclusive RegEx pattern that's leaving forward slashes behind? Is there something in the inclusive ReGex pattern that's causing the script to hang on the RegEx.Replace line? Is there a way to quickly switch the format of all cells in SrchRange from General to Text to bypass this little error?

Or, do you have any better ideas? I'm new to VBA.


Solution

  • You ask about a reliable way to make the problem disappear. In the actual question, you had in essence answered that question by saying formatting that cell as text helped. You have already set the range, now select it and use something like Selection.NumberFormat = "@" to convert enmass to text.