This is the type of text I'm working with:
* went to the building; opened the door; closed the door; picked up some money ($20)
* walked next door; knocked on a window; purchased an apple pie ($6.95)
* skipped down the street; turned to see where I'd come from; grabbed some burritos ($8); the street is wet from the rain
* installed some plumbing ($23)
Using regex on this example, I'm looking to grab the following:
from line 1: "picked up some money ($20)"
from line 2: "purchased an apple pie ($6.95)"
from line 3: "grabbed some burritos ($8)"
from line 4: "installed some plumbing ($23)"
The unifying factor in all of these is that they follow either a "*" or a ";", have a first word ending in "ed" and have a dollar value at the end in parentheses.
This is the regex I've got so far:
(?=[^;]*$)\w+ed (.+) \((\$\d{1,3}(.\d{2})?)\)
This matches everything correctly on lines 1, 2 and 4, but does not match the section I want on line 3, because that section is followed by a ";" and further text on the same line.
Any advice on what I can adjust in the regex is greatly appreciated!
You want to avoid going past the next semicolon. Use [^;\r\n]+ instead of .+ between the "ed" word and the dollar amount.
(?<= [*;] )
[^\S\r\n]*
( # (1 start)
\w+ ed \b [^;\r\n]+
\( \$ \d+
(?: \. \d* )?
\)
) # (1 end)
https://regex101.com/r/d6e0ve/1
(?<=[*;])[^\S\r\n]*(\w+ed\b[^;\r\n]+\(\$\d+(?:\.\d*)?\))
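As a quick check, here is a minimal sketch that runs the compact pattern against the four sample lines from the question. It assumes Python's re module; the variable names (text, pattern, matches) are only illustrative. Capture group 1 holds the wanted section, since the lookbehind and the leading whitespace are not part of the group.

```python
import re

# Sample lines copied from the question.
text = """\
* went to the building; opened the door; closed the door; picked up some money ($20)
* walked next door; knocked on a window; purchased an apple pie ($6.95)
* skipped down the street; turned to see where I'd come from; grabbed some burritos ($8); the street is wet from the rain
* installed some plumbing ($23)
"""

# Compact form of the pattern above. [^;\r\n]+ cannot cross a semicolon
# or a line break, so each match stays inside one ";"-delimited section.
pattern = re.compile(r"(?<=[*;])[^\S\r\n]*(\w+ed\b[^;\r\n]+\(\$\d+(?:\.\d*)?\))")

# With a single capture group, findall returns the group 1 strings.
matches = pattern.findall(text)
for m in matches:
    print(m)
```

This should print the four wanted sections, one per line, including "grabbed some burritos ($8)" from line 3. Note that (?:\.\d*)? is looser than the \d{2} in the question, so it would also accept something like ($6.); tighten it to (?:\.\d{2})? if that matters.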