python regex

Regex: Substitute pattern in string multiple times without leftovers


I'm trying to write a regex to match the pattern (##a.+?#a##b.+?#b) in strings where it may appear more than once:

foo##abar#a##bfoo#bbar##afoo#a##bbar#bfoobar

What I would like to do is replace the whole string with just the occurrences of the pattern, dropping everything else, so I've tried the following using regex101 (Python flavor):

Regex:

.*?(##a.+?#a##b.+?#b).*?

The problem is that, if the last occurrence of the pattern is followed by some text, as in the example above, the last portion is not matched, giving:

##abar#a##bfoo#b##afoo#a##bbar#bfoobar 

Expected output:

##abar#a##bfoo#b##afoo#a##bbar#b (just the matched pattern, without the extra text)

Is there a workaround for this kind of task? Many thanks in advance for any suggestion.


Solution

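  • You could add |$ as an alternative inside the group and drop the trailing .*?, so that other characters are only matched before the group. Once no further occurrence of the pattern is left, the lazy .*? runs to the end of the string, the group captures the empty string at $, and the leftover suffix is replaced by nothing: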

    .*?(##a.+?#a##b.+?#b|$)
    

    See it on regex101
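
A minimal sketch of how this could look in Python, assuming the substitution is done with re.sub and \1 as the replacement (the question only shows the regex101 setup, so that part is an assumption). The variable names are illustrative. The last line shows a different technique that is not part of the answer above: collecting the occurrences with re.findall and joining them.

```python
import re

text = "foo##abar#a##bfoo#bbar##afoo#a##bbar#bfoobar"

# Original attempt: the trailing "foobar" is never part of a match,
# so re.sub leaves it in place.
original = re.sub(r".*?(##a.+?#a##b.+?#b).*?", r"\1", text)

# Suggested fix: with |$ inside the group, the lazy .*? can run to the end
# of the string, where the group captures "" and the leftover text is dropped.
fixed = re.sub(r".*?(##a.+?#a##b.+?#b|$)", r"\1", text)

# Alternative (not from the answer): find every occurrence and join them.
joined = "".join(re.findall(r"##a.+?#a##b.+?#b", text))

print(original)  # ##abar#a##bfoo#b##afoo#a##bbar#bfoobar
print(fixed)     # ##abar#a##bfoo#b##afoo#a##bbar#b
print(joined)    # ##abar#a##bfoo#b##afoo#a##bbar#b
```

Note that . does not match a newline by default, so for multi-line input either variant would probably need the re.DOTALL flag.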