regex gsub ingest

lookaround in POSIX regex to match all spaces except the last (for gsub)


...freaking out because of this simple problem:

I'm using an Ingest pipeline with the gsub processor to replace all (white)spaces except the last. E.g.:

"hello world regex is fubar " to result in "hello, world, regex, is, fubar"

How can I convert the PCRE syntax (which won't work with gsub TRE patterns, as I found out)

"/\s(?=.\S*)/g"

To POSIX, like...

"/[[:space:]](?=.[[:space:]]*)/g"

(only spaces exchanged, not the lookaround)
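For reference, this is what the PCRE-style pattern does in an engine that supports lookahead. It is only a sketch in Python's `re` module, not the engine behind the gsub processor, and the function name `replace_inner_whitespace` is made up for illustration. The `", "` replacement is an assumption taken from the desired output above.

```python
import re

# Pattern from the question: one whitespace character that is followed
# by at least one more character, so whitespace at the very end is skipped.
LOOKAHEAD = re.compile(r"\s(?=.\S*)")

def replace_inner_whitespace(text: str, replacement: str = ", ") -> str:
    """Replace every whitespace character except one at the end of the string."""
    return LOOKAHEAD.sub(replacement, text)

if __name__ == "__main__":
    # The trailing space survives because nothing follows it.
    print(repr(replace_inner_whitespace("hello world regex is fubar ")))
```

Note that the lookahead only excludes whitespace at the very end of the string. If the input has no trailing whitespace, every space is replaced.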

Edit: As I can only provide the regex as a string, I cannot use a processor other than gsub. '\s' and '\S' are apparently marked as "unknown".


Solution

  • Worked using " +([^ ])" - another solution would be " +(.)". (Both without the double quotes)

    with the replacement/substitution string ,$1.

    Thanks to Wiktor Stribiżew for pointing this out.

    For whatever reason the POSIX class [:space:] does not work here, which is why [[:space:]]+(.) did not work either, even though it is a valid regex.
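The capture-group workaround can be tried outside the pipeline. The following is a minimal sketch in Python's `re` module, which is not the engine the gsub processor uses, so it only illustrates the pattern logic. The function name `join_words` is hypothetical, and Python reads the first capture group via `m.group(1)` where the processor's replacement string uses `$1`.

```python
import re

# One or more spaces followed by a non-space character. The character is
# captured so it can be put back after the separator. Spaces at the end
# of the string never match because no non-space character follows them.
CAPTURE = re.compile(r" +([^ ])")

def join_words(text: str, separator: str = ",") -> str:
    """Replace each run of spaces before a word with the separator."""
    # Equivalent to the replacement string ",$1" for the default separator.
    return CAPTURE.sub(lambda m: separator + m.group(1), text)

if __name__ == "__main__":
    sample = "hello world regex is fubar "
    print(repr(join_words(sample)))        # separator "," as in ",$1"
    print(repr(join_words(sample, ", ")))  # separator ", " adds the space
```

In this sketch the plain `,` separator gives `hello,world,regex,is,fubar ` with no space after each comma, and the trailing space is left in place. To get the `, ` spacing shown in the question, the replacement would presumably need to include the space (`, $1`). Unlike the lookahead version, ` +` also collapses runs of several spaces into a single separator.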