I have this regex:
^BRN.*?(?:paid|to)\s([A-Za-z\s]+)\b(?<!\bself)
I want it to return the words after the required pattern, but only if certain words are not found. If they are found, the regex shouldn't return anything. Thus,
BRN CLG-CI IQ PAID IONANDA PAUL
should return IONANDA PAUL, which it does. So it's correct there. But I want
BRN-TO CASH SELF
to return an empty string, or essentially to match but return no output. Currently, the regex returns CASH\s (the \s means a whitespace is included in the output). I tried a negative lookbehind, but I am still looking for a way to return nothing at all if the word is found. Thanks!
Note that your regex captures the CASH (plus the trailing whitespace) in BRN-TO CASH SELF with ([A-Za-z\s]+)\b because, once the word boundary after SELF is reached, the negative lookbehind fails and triggers backtracking: the regex engine gives back one char at a time, stepping back along the string, until it finds the word boundary position right before SELF. No SELF as a whole word is present immediately to the left of that location, so that is a valid match.
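The backtracking described above can be reproduced with a short Python sketch. It assumes the pattern is run case-insensitively (the sample inputs are upper-case while the pattern is lower-case), and the helper name captured is made up for illustration:

```python
import re

# Assumption: case-insensitive matching, since the question's inputs are
# upper-case while the pattern spells "paid", "to" and "self" in lower-case.
ORIGINAL = re.compile(
    r"^BRN.*?(?:paid|to)\s([A-Za-z\s]+)\b(?<!\bself)",
    re.IGNORECASE,
)

def captured(text):
    """Return group 1 of the original pattern, or None if there is no match."""
    m = ORIGINAL.search(text)
    return m.group(1) if m else None

# The lookbehind does not reject the string; it only forces the group to
# shrink until it ends at the word boundary right before SELF.
print(repr(captured("BRN CLG-CI IQ PAID IONANDA PAUL")))  # 'IONANDA PAUL'
print(repr(captured("BRN-TO CASH SELF")))                 # 'CASH '
```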
You can use a negative lookahead after \s:
^BRN.*?(?:paid|to)\s(?![A-Za-z\s]*\bself\b)([A-Za-z\s]+)
#                   ^^^^^^^^^^^^^^^^^^^^^^^
See the regex demo.
Now, right after matching the whitespace after paid or to, the negative lookahead check is triggered once: if there is a whole word self after any zero or more ASCII letters or whitespace chars, the whole match fails; otherwise, it succeeds.
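Here is a minimal Python sketch of the lookahead version, again assuming case-insensitive matching. The helper name payee is hypothetical, and returning an empty string on a failed match is just one way to get the "no output" behavior the question asks for:

```python
import re

# Assumption: case-insensitive matching, as the sample inputs are upper-case.
FIXED = re.compile(
    r"^BRN.*?(?:paid|to)\s(?![A-Za-z\s]*\bself\b)([A-Za-z\s]+)",
    re.IGNORECASE,
)

def payee(text):
    """Return the captured words, or an empty string when the match fails."""
    m = FIXED.search(text)
    return m.group(1) if m else ""

print(repr(payee("BRN CLG-CI IQ PAID IONANDA PAUL")))  # 'IONANDA PAUL'
print(repr(payee("BRN-TO CASH SELF")))                 # ''
# Only the whole word is rejected: SELFISH does not trigger the lookahead.
print(repr(payee("BRN PAID SELFISH MAN")))             # 'SELFISH MAN'
```

Because the lookahead sits before the capturing group, the check runs before anything is captured, so there is no shorter capture for the engine to fall back on.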