I'm trying to use Snowflake's regex implementation, which I have just discovered is POSIX BRE/ERE. I had previously written a regex that identifies all commas outside double-quoted string sections so that I can replace them with a custom delimiter (for text file parsing).
Sample text string:
"Foreign Corporate Name Registration","99999","Valuation Research",,"Active Name",02/09/2020,"02/09/2020","NEVADA","UNITED STATES",,,"123 SOME STREET",,"MILWAUKEE","WI","53202","UNITED STATES","123 SOME STREET",,"MILWAUKEE","WI","53202","UNITED STATES",,,,,,,,,,,,
Regex command and substitution (working in regex101.com):
([("].*?["])*?(,)
\1#^#
Regex101.com (and desired) result:
"Foreign Corporate Name Registration"#^#"99999"#^#"Valuation Research"#^##^#"Active Name"#^#02/09/2020#^#"02/09/2020"#^#"NEVADA"#^#"UNITED STATES"#^##^##^#"123 SOME STREET"#^##^#"MILWAUKEE"#^#"WI"#^#"53202"#^#"UNITED STATES"#^#"123 SOME STREET"#^##^#"MILWAUKEE"#^#"WI"#^#"53202"#^#"UNITED STATES"#^##^##^##^##^##^##^##^##^##^##^##^#
So, given that I am now belatedly discovering that I cannot use lazy quantifiers, can any uber-regex'ers advise on how I might alter my expression to return the same result while being compliant with POSIX BRE/ERE?
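You need to fix the character class first: [("] matches either ( or ", but you only need to match ", so use a plain " instead. The lazy .*? can then be replaced with the negated character class [^"]*, which cannot run past the closing quote, and the outer *? becomes a plain *. The final POSIX ERE expression will look like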
("[^"]*")*(,)
It matches:
("[^"]*")* - zero or more occurrences of a ", zero or more chars other than ", and then a " (Group 1)
(,) - a comma (Group 2)
NOTE: The POSIX BRE expression will look like \("[^"]*"\)*\(,\), where capturing groups are defined with a pair of escaped parentheses.
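To sanity-check the ERE pattern outside Snowflake, here is a minimal Python sketch. Python's re module is not a POSIX engine, so this only shows that the pattern works without lazy quantifiers. It uses constructs that behave the same way here, but it is not a test of Snowflake itself. The function name replace_delimiters and the shortened sample line (including the made-up "MILWAUKEE, WI" field with an embedded comma) are illustrative assumptions, not part of the question.

```python
import re

# Pattern from the answer: an optional run of double-quoted fields (group 1)
# followed by a comma (group 2). No lazy quantifiers are involved.
PATTERN = re.compile(r'("[^"]*")*(,)')


def replace_delimiters(line: str, delimiter: str = "#^#") -> str:
    """Replace every comma outside double quotes with `delimiter`."""
    # Group 1 is None when the comma is not preceded by a quoted field,
    # so fall back to an empty string in that case.
    return PATTERN.sub(lambda m: (m.group(1) or "") + delimiter, line)


# Hypothetical shortened sample; the quoted field with a comma is invented
# to show that commas inside quotes are left alone.
sample = '"Valuation Research",,"Active Name",02/09/2020,"MILWAUKEE, WI",,'
result = replace_delimiters(sample)
print(result)
```

The callable replacement is used instead of the \1#^# template only so that a delimiter containing backslashes would not be misread as an escape; for the delimiter in the question either form should behave the same.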