sed applescript

Does SED not process expressions from left to right?


I've been trying the following SED statement (sedex1) in seddy.dev: sed '/^$/d ; s/^[ \t]*// ; s/[ \t]*$// ; /::/N ; s/::/: / ; s/\n//'

and the output I get differs depending on whether I do it in one SED command or break it up into two: the first half to strip blank lines and excess whitespace, the second half to join the lines with minor substitutions to get the desired end result.

My end goal is to use this in an AppleScript on macOS. Our team needs to process text like this many times a day (from client registration forms) and we need a fast and painless way to do it. We're thinking pbcopy and pbpaste, but obviously we'll get to that once we have the SED thing under control.

Using that SED command (sedex1) on this input text (in1):

thingOne::
some text for thingOne

thingTwo::
    some text for thingTwo

thingThree::

some text for thingThree

thingFour::

    some text for thingFour

I get this output text (out1):

thingOne: some text for thingOne
thingTwo:     some text for thingTwo
thingThree: 
some text for thingThree
thingFour: 
some text for thingFour

but what I'm trying to get is this output (wantedText):

thingOne: some text for thingOne
thingTwo: some text for thingTwo
thingThree: some text for thingThree
thingFour: some text for thingFour

If I use the same SED patterns but in two stages, then everything behaves as expected. Specifically, running this (sedex2): sed '/^$/d ; s/^[ \t]*// ; s/[ \t]*$//'

on in1 produces this output (out2):

thingOne::
some text for thingOne
thingTwo::
some text for thingTwo
thingThree::
some text for thingThree
thingFour::
some text for thingFour

and then using that (out2) as input, this SED command (sedex3) does the rest: sed '/::/N ; s/::/: / ; s/\n//'

producing the desired end result (wantedText).
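
A sketch of that two-stage run as a single pipeline, for reference. The function name `clean_form` is made up for illustration, and `[[:blank:]]` (the POSIX class for space and tab) stands in for `[ \t]` so the result does not depend on how a particular sed treats `\t` inside brackets.

```shell
# clean_form is a hypothetical helper name, not part of the original post.
# Stage 1 (sedex2): drop empty lines, trim leading and trailing blanks.
# Stage 2 (sedex3): join each "::" line with the line that follows it.
clean_form() {
  sed '/^$/d; s/^[[:blank:]]*//; s/[[:blank:]]*$//' |
    sed '/::/N; s/::/: /; s/\n//'
}

# in1 from above, fed through both stages.
in1='thingOne::
some text for thingOne

thingTwo::
    some text for thingTwo

thingThree::

some text for thingThree

thingFour::

    some text for thingFour'

printf '%s\n' "$in1" | clean_form
```

On macOS the same function could presumably sit between the clipboard tools, as in `pbpaste | clean_form | pbcopy`.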

What am I not understanding and/or missing about sedex1 that it doesn't produce the desired result (wantedText) when those very same patterns used consecutively in sedex2 and then sedex3 do produce wantedText?


Solution
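
On the question itself: sed does run the script left to right, but only once per cycle. `N` appends the next raw input line to the pattern space without restarting the script, so the `/^$/d` and the whitespace substitutions placed before `N` never see that appended line. A minimal sketch of the effect follows; the `key`/`value` sample is invented for illustration, and `[[:space:]]` replaces `[ \t]` for portability.

```shell
# Invented three-line sample: a "::" line, an empty line, an indented value.
sample='key::

   value'

# One pass, same command order as sedex1. N pulls in the empty line after
# the delete and trim commands have already run, so it is joined as-is.
one_pass=$(printf '%s\n' "$sample" |
  sed '/^$/d; s/^[[:space:]]*//; /::/N; s/::/: /; s/\n//')

# Two passes: the second sed only ever sees lines that were already cleaned.
two_pass=$(printf '%s\n' "$sample" |
  sed '/^$/d; s/^[[:space:]]*//' |
  sed '/::/N; s/::/: /; s/\n//')

printf 'one pass:\n%s\n\ntwo passes:\n%s\n' "$one_pass" "$two_pass"
```

The one-pass run leaves `key: ` and `value` on separate lines, which is the same shape as the thingThree and thingFour lines in out1, while the two-pass run prints `key: value`.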

  • This might work for you (GNU sed):

    sed ':a;/::\s*/{x;s//: /p;x;h;d};/\S/H;$!d;g;ba' file
    

    Every time we see a line containing :: we want to process accumulated lines, print the result and start accumulating more lines.

    The accumulated lines are placed in the hold space: the head of these lines is the one containing :: and subsequent non-empty lines are appended to it.

    At the end of the file, the last accumulated lines replace the current line in the pattern space and a goto ensures that these final accumulated lines are processed as normal.

    Esoteric amelioration to the above solution:

    sed '/::\s*/{x;s//: /p;x;h;d};/\S/H;$!d;G;D' file
    

    Alternative opaque solution:

    sed '/\n/!{/\S/H;$!d;x;D};s/::\s*/: /;P;D' file
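
For readers who find the one-liners hard to follow, here is the first command spread over several lines with comments and run against the question's sample input. It assumes GNU sed, as the answer does (`\s` and `\S` are GNU extensions). The comments are one reading of the script, not something taken from the original answer, and the variable names `in1` and `out` are only for illustration.

```shell
in1='thingOne::
some text for thingOne

thingTwo::
    some text for thingTwo

thingThree::

some text for thingThree

thingFour::

    some text for thingFour'

out=$(printf '%s\n' "$in1" | sed '
:a
# A line containing "::" starts a new record.
/::\s*/{
  # Swap: the previous record (held so far) comes into the pattern space.
  x
  # The empty regex reuses /::\s*/, so "::" plus any following whitespace,
  # including the embedded newline, becomes ": ". Print if it matched.
  s//: /p
  # Swap back and store the new "::" line as the head of the next record.
  x
  h
  d
}
# Any other non-blank line is appended to the record in the hold space.
/\S/H
# Not the last line: end this cycle without printing.
$!d
# Last line: fetch the final record and run it through the block above.
g
ba
')

printf '%s\n' "$out"
```

Because every path through the script ends in `d`, nothing is auto-printed; the only output comes from the `p` flag on the substitution.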