I am trying to match a regex pattern across multiple lines. The pattern begins and ends with a substring, both of which must be at the beginning of a line. I can match across lines, but I can't seem to specify that the end pattern must also be at the beginning of a line.
Example string:
Example=N ; Comment Line One error=
; Comment Line Two.
Desired=
I am trying to match from Example= up to Desired=. This will work if error= is not in the string. However, when it is present, I match Example=N ; Comment Line One error=
config_value = 'Example'
pattern = '^{}=(.*?)([A-Za-z]=)'.format(config_value)
match = re.search(pattern, string, re.M | re.DOTALL)
I also tried:
config_value = 'Example'
pattern = '^{}=(.*?)(^[A-Za-z]=)'.format(config_value)
match = re.search(pattern, string, re.M | re.DOTALL)
Your pattern contains .*? and your options include re.DOTALL (re.S is its equivalent), which makes the . match newlines, too. For those of you who wonder why your regex does not match across multiple lines, check these first:
- You use . without the re.S/re.DOTALL flags (or their inline equivalent (?s)). As an example, you can't expect a match in re.search(r'a(.*?)b', '__ a\nb __'), but you will find a match in re.search(r'a(.*?)b', '__ a\nb __', re.DOTALL) or re.search(r'(?s)a(.*?)b', '__ a\nb __').
- You test the regex in an online tester with a string literal such as '__ a\nb __'. Paste it there, and replace the \n with a real literal line break.
- You confuse re.S/re.DOTALL with re.M/re.MULTILINE. The latter re-defines the behavior of the ^ and $ anchors only: with re.M/re.MULTILINE, ^ matches the start of any line and $ matches the end of any line. It does not mean your regex will automatically start finding matches that span multiple lines.
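The difference between the two flags can be checked directly. The following is a minimal sketch; the sample string and the variable names are made up for illustration and are not part of the question.

```python
import re

# Illustrative three-line input (an assumption, not the OP's data).
text = "a\nb=1\nc"

# re.DOTALL changes what "." matches: it may now consume "\n".
dot_plain = re.search(r"a.b", text)
dot_dotall = re.search(r"a.b", text, re.DOTALL)

# re.MULTILINE changes only the anchors: "^" and "$" work per line.
anchor_plain = re.search(r"^b=", text)
anchor_multiline = re.search(r"^b=", text, re.MULTILINE)

# re.MULTILINE alone does not let "." cross a line break.
span_multiline = re.search(r"^a.*c$", text, re.MULTILINE)
span_both = re.search(r"^a.*c$", text, re.MULTILINE | re.DOTALL)

print(dot_plain, dot_dotall, anchor_plain, anchor_multiline, span_multiline, span_both)
```

Only the last search, with both flags set, spans all three lines.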
Now, for this concrete case in the OP, you may use
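import re

# Sample input from the question
s = 'Example=N ; Comment Line One error=\n; Comment Line Two.\nDesired='
config_value = 'Example'
pattern = r'(?sm)^{}=(.*?)(?=[\r\n]+\w+=|\Z)'.format(config_value)
match = re.search(pattern, s)
if match:
    print(match.group(1))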
Pattern details
(?sm) - re.DOTALL and re.M are on
^ - start of a line
Example= - a substring
(.*?) - Group 1: any 0+ chars, as few as possible
(?=[\r\n]+\w+=|\Z) - a positive lookahead that requires the presence of 1+ CR or LF symbols followed with 1 or more word chars followed with a = sign, or end of the string (\Z).
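If the lookup is needed for several keys, the pattern can be wrapped in a small helper. This is only a sketch: the function name get_config_value and the sample variable are hypothetical, and re.escape is added here as an assumption that keys may contain regex metacharacters.

```python
import re


def get_config_value(text, key):
    """Return the raw value following `key=`, or None if the key is absent."""
    # re.escape keeps characters such as "." or "+" in the key literal.
    pattern = r'(?sm)^{}=(.*?)(?=[\r\n]+\w+=|\Z)'.format(re.escape(key))
    match = re.search(pattern, text)
    return match.group(1) if match else None


# Sample input from the question.
sample = 'Example=N ; Comment Line One error=\n; Comment Line Two.\nDesired='

value = get_config_value(sample, 'Example')
print(repr(value))
```

Because of the ^ anchor, error= is not treated as a key here: it does not sit at the start of a line, so get_config_value(sample, 'error') returns None.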