Is regex in VB.NET known to be slow?
I took over some code that cleans large amounts of text data. The code ran fairly slowly, so I was looking for ways to speed it up. I found a couple of functions that are called a lot and that I thought might be part of the problem.
Here's the original code for cleaning a phone number:
Dim strArray() As Char = strPhoneNum.ToCharArray
Dim strNewPhone As String = ""
Dim i As Integer
For i = 0 To strArray.Length - 1
    If strArray.Length = 11 And strArray(0) = "1" And i = 0 Then
        Continue For
    End If
    If IsNumeric(strArray(i)) Then
        strNewPhone = strNewPhone & strArray(i)
    End If
Next
If Len(strNewPhone) = 7 Or Len(strNewPhone) = 10 Then
    Return strNewPhone
End If
I rewrote the code with a regex to eliminate the array and the loop:
Dim strNewPhone As String = ""
strNewPhone = Regex.Replace(strPhoneNum, "\D", "")
If strNewPhone = "" OrElse strNewPhone.Substring(0, 1) <> "1" Then
    Return strNewPhone
Else
    strNewPhone = Mid(strNewPhone, 2)
End If
If Len(strNewPhone) = 7 Or Len(strNewPhone) = 10 Then
    Return strNewPhone
End If
After running a couple of tests, the new code is significantly slower than the old. Is regex in VB.NET slow, did I introduce some other problem, or was the original code fine the way it was?
I ran some tests with the Visual Studio Profiler and did not get the same results you did. There was a logical error in your Regex function that caused the length check to be skipped if the number didn't begin with 1. I corrected that in my tests.
Results
My method was consistently slightly faster.
My Conclusion
In all tests the Original method was much slower. Had it come out ahead in even one test, I might be able to explain our discrepancy. If you test those methods in total isolation, I think you will come up with something similar.
My best guess is that something else was affecting your results and that your assessment that the Original method was better is incorrect.
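For testing a method in isolation without a profiler, a rough Stopwatch harness is one option. This is only a sketch: `PhoneBenchmark`, `TimeCleaner`, and `CleanWithRegex` are hypothetical names, the inputs and iteration count are made up, and the cleaner here is a stand-in for whichever function you actually want to measure.

```vbnet
Imports System
Imports System.Diagnostics
Imports System.Text.RegularExpressions

Module PhoneBenchmark
    ' Hypothetical stand-in for the cleaner being measured; swap in your own function.
    Function CleanWithRegex(strPhoneNum As String) As String
        Dim digits As String = Regex.Replace(strPhoneNum, "\D", "")
        If digits.Length >= 7 AndAlso digits(0) = "1"c Then
            digits = digits.Remove(0, 1)
        End If
        Return If(digits.Length = 7 OrElse digits.Length = 10, digits, "")
    End Function

    ' Runs one cleaner over the same inputs and returns the elapsed milliseconds.
    Function TimeCleaner(cleaner As Func(Of String, String),
                         inputs As String(),
                         iterations As Integer) As Long
        ' Warm-up pass so JIT compilation and first-use regex setup are not timed.
        For Each s As String In inputs
            cleaner.Invoke(s)
        Next

        Dim sw As Stopwatch = Stopwatch.StartNew()
        For i As Integer = 1 To iterations
            For Each s As String In inputs
                cleaner.Invoke(s)
            Next
        Next
        sw.Stop()
        Return sw.ElapsedMilliseconds
    End Function

    Sub Main()
        ' Made-up sample inputs; use data that resembles your real workload.
        Dim inputs As String() = {"(555) 123-4567", "1-555-123-4567", "555-1234", "not a phone"}
        Dim ms As Long = TimeCleaner(AddressOf CleanWithRegex, inputs, 100000)
        Console.WriteLine("Regex cleaner: {0} ms", ms)
    End Sub
End Module
```

Timing each candidate with the same inputs and the same iteration count, in a Release build, should make it easier to tell whether the cleaner itself or something around it is the slow part.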
Your Revised Function
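' Requires: Imports System.Text.RegularExpressions
Function GetPhoneNumberRegex(strPhoneNum As String) As String
    Dim strNewPhone As String = ""
    ' Strip every non-digit character.
    strNewPhone = Regex.Replace(strPhoneNum, "\D", "")
    ' AndAlso short-circuits, so Substring is never called on an empty string.
    If strNewPhone <> "" AndAlso strNewPhone.Substring(0, 1) = "1" Then
        strNewPhone = Mid(strNewPhone, 2)
    End If
    ' The length check now runs for every input.
    If Len(strNewPhone) = 7 Or Len(strNewPhone) = 10 Then
        Return strNewPhone
    End If
    Return ""
End Function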
My Function
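' Requires: Imports System.Text.RegularExpressions
Function GetPhoneNumberMine(strPhoneNum As String) As String
    ' Strip every non-digit character.
    Dim strNewPhone As String = Regex.Replace(strPhoneNum, "\D", "")
    ' AndAlso short-circuits, so strNewPhone(0) is never read on an empty string.
    If strNewPhone.Length >= 7 AndAlso strNewPhone(0) = "1"c Then
        strNewPhone = strNewPhone.Remove(0, 1)
    End If
    ' Only 7- or 10-digit results count as valid.
    Return If(strNewPhone.Length = 7 OrElse strNewPhone.Length = 10, strNewPhone, "")
End Function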
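If the function is called very often, one more thing that may be worth trying is building the Regex once and reusing it. This was not part of my tests, so treat it as an untested sketch: `PhoneCleaner`, `NonDigits`, and `GetPhoneNumberCached` are hypothetical names, and since the static Regex methods already cache recently used patterns, the gain may be small or nonexistent.

```vbnet
Imports System.Text.RegularExpressions

Module PhoneCleaner
    ' Built once and reused. RegexOptions.Compiled costs more at startup
    ' in exchange for potentially faster matching afterwards.
    Private ReadOnly NonDigits As New Regex("\D", RegexOptions.Compiled)

    ' Same logic as GetPhoneNumberMine, but using the shared Regex instance.
    Function GetPhoneNumberCached(strPhoneNum As String) As String
        Dim strNewPhone As String = NonDigits.Replace(strPhoneNum, "")
        If strNewPhone.Length >= 7 AndAlso strNewPhone(0) = "1"c Then
            strNewPhone = strNewPhone.Remove(0, 1)
        End If
        Return If(strNewPhone.Length = 7 OrElse strNewPhone.Length = 10, strNewPhone, "")
    End Function
End Module
```

Measure it against the other versions with your own data before adopting it.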