[SOLVED] Regex: Match all characters between two strings

Example: In the Netherlands, peanut butter is called "pindakaas" (peanut cheese) rather than "pindaboter" (peanut butter) because the word butter is only supposed to be used with products that contain actual butter.

I want to match everything between cheese and butter, and vice versa.

Goals:

butter is called "pindakaas" (peanut cheese
cheese) rather than "pindaboter" (peanut butter

EDIT: The language is Python 3.7, and the regex I'm currently using is cheese(.*?)butter.

Solution

If you install the regex package from the PyPI repository, then you can do overlapped searches:

import regex as re

text = 'In the Netherlands, peanut butter is called "pindakaas" (peanut cheese) rather than "pindaboter" (peanut butter) because the word butter is only supposed to be used with products that contain actual butter.'

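# overlapped=True lets a new match start inside the text of the previous match,
# so the same "cheese" can end one match and begin the next.
matches = re.findall(r'\bbutter\b.*?\bcheese\b|\bcheese\b.*?\bbutter\b', text, overlapped=True)
print(matches)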

Prints:

['butter is called "pindakaas" (peanut cheese', 'cheese) rather than "pindaboter" (peanut butter']

I kept your basic regex but anchored butter and cheese to word boundaries by placing \b before and after each word (e.g. \bbutter\b), so they do not match inside longer words. Drop the \b anchors if you also want to match them as substrings.
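
If installing a third-party package is not an option, a similar result can probably be had with the standard-library re module by wrapping the same pattern in a lookahead with a capture group. Each match is then zero-width, so the scan advances one character at a time and can pick up overlapping spans. This is only a sketch of that alternative, not what the regex package does internally; the names PATTERN and find_overlapping are made up for illustration.

```python
import re

# Same alternation as above, wrapped in (?=( ... )) so that the match itself
# consumes no characters and the wanted text lands in capture group 1.
# PATTERN and find_overlapping are hypothetical names for this sketch.
PATTERN = re.compile(r'(?=(\bbutter\b.*?\bcheese\b|\bcheese\b.*?\bbutter\b))')


def find_overlapping(text):
    """Return every butter...cheese or cheese...butter span, overlaps included."""
    # With exactly one capture group, findall returns that group's text
    # for each (zero-width) match.
    return PATTERN.findall(text)


text = ('In the Netherlands, peanut butter is called "pindakaas" (peanut cheese) '
        'rather than "pindaboter" (peanut butter) because the word butter is only '
        'supposed to be used with products that contain actual butter.')

print(find_overlapping(text))
```

On the example sentence this should return the same two strings as the regex version. A plain re.findall with the unwrapped pattern finds only the first one, because the first match consumes "cheese" and the second match needs to start there.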