I want to match the closing quote of one string together with the opening quote of the following string if both are on the same line. Two strings may be separated either by a single blank (" ") or by blank-plus-blank (" + ").
Regex engine: Python
For instance, from
this is "some string" "; which should match" 234
"and this" + "should also match\"" "\"and this"
but not this: " " a " + "
I'd like to see matches for:
" " from between [some string] and [; which...]
" + " from between [and this] and [should also match\"]
" " from between [should also match\"] and [\"and this]
So in fact, I think it might be best to match the groups " " and " + " only if there is an odd number of quotes before and after the group. Since lookbehind in Python's re module must be fixed-width, I didn't find a good way to do it.
I tried
re.compile(r'(" \+ ")|(" ")(?!;|,)')
but this assumes that no string starts with a semicolon or comma
and also
re.compile(r'"[^"]+"')
which only finds the strings themselves, not the "inter-string" quotes.
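To make the limitation of the first attempt concrete, here is a small check against the three sample lines. It is only an illustration; the names attempt and lines are made up for this snippet.

```python
import re

# First attempt from above: '" + "' anywhere, or '" "' not followed by ; or ,
attempt = re.compile(r'(" \+ ")|(" ")(?!;|,)')

lines = [
    'this is "some string" "; which should match" 234',
    r'"and this" + "should also match\"" "\"and this"',
    'but not this: " " a " + "',
]

for line in lines:
    # group(0) is the whole match, regardless of which alternative hit
    print([m.group(0) for m in attempt.finditer(line)])
```

The first line yields nothing (the wanted " " is rejected because the next string starts with a semicolon), while the third line yields two unwanted matches, since the pattern has no notion of being inside or outside a string.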
Here's the character loop parsing method I mentioned above. I track whether we are inside a quote or not, and I track the characters between quotes.
data = """\
this is "some string" "; which should match" 234
"and this" + "should also match\\"" "\\"and this"
but not this: " " a " + "
"""

def check(line):
    in_quotes = False
    between = "xxxx"      # dummy text so the first quote on a line never matches
    found = []
    escape = False
    for c in line:
        if escape:
            # character after a backslash: skip it, whatever it is
            escape = False
        elif c == '"':
            # an opening quote preceded by exactly ' ' or ' + ' is a hit
            if not in_quotes and between in (' ', ' + '):
                found.append(between)
            between = ""
            in_quotes = not in_quotes
        elif c == '\\':
            escape = True
        elif not in_quotes:
            # collect the text that sits between two strings
            between += c
    return found

for line in data.splitlines():
    print(line)
    matches = check(line)
    print(matches)
Output:
this is "some string" "; which should match" 234
[' ']
"and this" + "should also match\"" "\"and this"
[' + ', ' ']
but not this: " " a " + "
[]
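If the positions of the matches are needed as well, one possible variation is to let a regex tokenise the string literals and then inspect the gap between consecutive literals. This is only a sketch under some assumptions: the names STRING and separators are made up for illustration, backslash escapes are assumed to occur only inside quotes, and every quote is assumed to be closed on the same line.

```python
import re

# A double-quoted literal: opening quote, then escaped pairs (\x) or
# ordinary characters, then the closing quote.
STRING = re.compile(r'"(?:\\.|[^"\\])*"')

def separators(line):
    """Return (start, end, text) for each '" "' or '" + "' joining two strings."""
    found = []
    prev_end = None          # index just past the previous literal
    for m in STRING.finditer(line):
        if prev_end is not None:
            gap = line[prev_end:m.start()]
            if gap in (' ', ' + '):
                # Widen the span by one on each side so it covers the
                # closing quote before and the opening quote after the gap.
                start, end = prev_end - 1, m.start() + 1
                found.append((start, end, line[start:end]))
        prev_end = m.end()
    return found

print(separators('this is "some string" "; which should match" 234'))
```

For the first sample line this prints [(20, 23, '" "')]. Because whole literals are consumed first, a quote inside a string can never be mistaken for a separator, which is the same idea as the in_quotes flag in the loop above.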