bash shell substitution

Replace everything from string1 to string2 in file including newlines


I have a file with content like this:
(marked as js for better readability only, could be any plain text file)

some text
/*%SKIP% line comment %SKIP%*/
some text
/*%SKIP%
block
comment
I could contain everything except the end sequence
%SKIP%*/
some text

Now I want to remove everything between /*%SKIP% and %SKIP%*/, so that the file contains:

some text
some text
some text

Whether blank lines remain in the resulting file is not that important, but preferably none should be left at the locations where something was removed. I was able to achieve this for single lines with sed, but failed when it came to multi-line content.

I guess it shouldn't matter that much, but as a side note: the "start" and "end" strings are variable and stored in the bash variables open_tag=/*%SKIP% and close_tag=%SKIP%*/.

The only limitation is to use tools that are commonly preinstalled on most Linux distributions, so sed, awk, perl and grep should all be fine.

How can I achieve this?


Solution

  • Using a perl one-liner:

    $ cat input.txt
    some text
    /*%SKIP% line comment %SKIP%*/
    some text
    /*%SKIP%
    block
    comment
    I could contain everything except the end sequence
    %SKIP%*/
    some text
    $ perl -0777 -pe 's{\R?/\*%SKIP%.*?%SKIP%\*/}{}sg' input.txt
    some text
    some text
    some text
    

    This reads the entire file at once (-0777; perl 5.36.0 and newer can use -g instead) and replaces every SKIP block with an empty string. The pattern starts with \R?, an optional line break in front of the block, which is removed along with it to help prevent empty lines in the output. .*? matches non-greedily, so each match stops at the nearest %SKIP%*/ instead of spanning everything between the first /*%SKIP% and the last %SKIP%*/. The s option allows . to match newlines, and g applies the substitution to every match, as in sed.