I am trying to isolate street address fields that begin with a digit, contain an underscore and end with a comma:
001 ALLAN Witham Ross 13 Every_Street, Welltown Greenkeeper
002 ALLARDYCE Margaret Isabel 49 Bell_Road, Musicville Housewife
003 ALLARDYCE Mervyn George 49 Bell_Road, Musicville Company Mngr
For example, I want to extract:
13 Every_Street, Welltown
49 Bell_Road, Musicville
49 Bell_Road, Musicville
My regex is:
(?ms)([0-9]+\s[A-Z][a-z].+(?=,))
But this matches from 13 all the way through to the final 'd' of the last Bell_Road, which is almost everything. See the regex101 example.
So the match runs past the first two commas and only stops before the third? I want it to match up to the next comma, and to do that three times :)
This produces your desired matches:
\d+[^,\d]*_[^,]+, \S+
demo
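As a rough check, here is a minimal Python sketch that runs the pattern above against the sample from the question. The variable names are illustrative only, and the one-record-per-line layout of the sample is an assumption (the result is the same if it is all on one line). For contrast it also runs the original pattern, whose greedy .+ (with the s flag letting the dot cross line breaks) runs to the end of the text and backtracks only as far as the last comma, which would explain the single oversized match.

```python
import re

# Sample text from the question (one record per line is an assumption).
text = (
    "001 ALLAN Witham Ross 13 Every_Street, Welltown Greenkeeper\n"
    "002 ALLARDYCE Margaret Isabel 49 Bell_Road, Musicville Housewife\n"
    "003 ALLARDYCE Mervyn George 49 Bell_Road, Musicville Company Mngr"
)

# Digits, then non-comma/non-digit characters up to an underscore,
# then the rest of the street name, a comma, a space and one more word.
pattern = re.compile(r"\d+[^,\d]*_[^,]+, \S+")
addresses = pattern.findall(text)
print(addresses)

# The pattern from the question: the greedy .+ swallows everything up to
# the position just before the last comma, so there is only one match.
greedy = re.compile(r"(?ms)([0-9]+\s[A-Z][a-z].+(?=,))")
greedy_matches = greedy.findall(text)
print(len(greedy_matches))
```

The negated character classes [^,\d] and [^,] are what keep each match from running past its own comma, so no lazy quantifier is needed.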
They don't end with a comma, though. For that you could just remove the trailing \S+ (and the space before it) at the end.
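A small Python sketch of that shortened variant, again using the sample from the question (names are illustrative, and the one-record-per-line layout is an assumption). Here the space before \S+ is dropped as well, so each match stops exactly at the comma:

```python
import re

# Sample text from the question (one record per line is an assumption).
text = (
    "001 ALLAN Witham Ross 13 Every_Street, Welltown Greenkeeper\n"
    "002 ALLARDYCE Margaret Isabel 49 Bell_Road, Musicville Housewife\n"
    "003 ALLARDYCE Mervyn George 49 Bell_Road, Musicville Company Mngr"
)

# Same pattern without the trailing " \S+": each match ends at the comma.
pattern = re.compile(r"\d+[^,\d]*_[^,]+,")
street_parts = pattern.findall(text)
print(street_parts)
```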