
Match multiple strings across multiple lines and replace them


I want to change DISPLAY="TRUE" to DISPLAY="FALSE" on the first line and DISPLAY="FALSE" to DISPLAY="TRUE" on the next line, in a single match.

Example: FROM:

 <SYN DISPLAY="TRUE" SEARCH="TRUE" CLASSIFY="TRUE">Appels</SYN>
 <SYN DISPLAY="FALSE" SEARCH="FALSE" CLASSIFY="TRUE">103.103117.1031171012</SYN>

TO

 <SYN DISPLAY="FALSE" SEARCH="TRUE" CLASSIFY="TRUE">Appels</SYN>
 <SYN DISPLAY="TRUE" SEARCH="FALSE" CLASSIFY="TRUE">103.103117.1031171012</SYN>

Note that everything else on the lines containing <SYN DISPLAY="TRUE" or <SYN DISPLAY="FALSE" may differ.

The replacement should happen only when both lines occur together (as shown above), i.e. when <SYN DISPLAY="TRUE" is on the first line and <SYN DISPLAY="FALSE" is on the line right after it. A single line such as the one in the following example should not be changed.

<DIMENSION_NODE>
   <DVAL TYPE="EXACT">
      <DVAL_ID ID="4294960976"/>
      <SYN DISPLAY="TRUE" SEARCH="TRUE" CLASSIFY="TRUE">2</SYN>
   </DVAL>
</DIMENSION_NODE>


I tried using sed, but I couldn't make it work:

sed -E 's/(<SYN DISPLAY=\")TRUE(\".+\s+<SYN DISPLAY=\")FALSE(\".+<\/SYN>)/\1FALSE\2TRUE\3/' test.xml

Any help from the experts to make it work would be appreciated :)


Solution

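  • With GNU sed's -z option the input is split on NUL bytes instead of newlines, so the whole file is read as one record and the newlines are handled as normal characters that \n can match: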

    sed -zr 's/(SYN DISPLAY=)("TRUE")([^\n]*)\n([^\n]*)SYN DISPLAY=("FALSE")/\1\5\3\n\4\1\2/g' inputfile
    

    In your example the remembered strings are:

    \1=SYN DISPLAY=
    \2="TRUE"
    \3= SEARCH="TRUE" CLASSIFY="TRUE">Appels</SYN>
    \4= <
    \5="FALSE"
    

    Both lines are needed for a match, so a single line on its own will not be changed.
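
To check the behaviour before running it on real data, a small test along these lines may help. It is only a sketch: it assumes GNU sed (the -z option is a GNU extension), and the temporary file name sample.xml is made up for the demonstration. The sample contains one lone DISPLAY="TRUE" line and one TRUE/FALSE pair.

```shell
#!/usr/bin/env bash
# Assumption: GNU sed is installed (-z is a GNU extension).
# sample.xml is a hypothetical file name used only for this demo.
tmpdir=$(mktemp -d)
sample="$tmpdir/sample.xml"

# One lone DISPLAY="TRUE" line (must stay as it is) followed by
# a TRUE/FALSE pair on consecutive lines (must be swapped).
cat > "$sample" <<'EOF'
<DVAL TYPE="EXACT">
   <SYN DISPLAY="TRUE" SEARCH="TRUE" CLASSIFY="TRUE">2</SYN>
</DVAL>
<SYN DISPLAY="TRUE" SEARCH="TRUE" CLASSIFY="TRUE">Appels</SYN>
<SYN DISPLAY="FALSE" SEARCH="FALSE" CLASSIFY="TRUE">103.103117.1031171012</SYN>
EOF

# Same command as above; the result is captured so it can be inspected.
result=$(sed -zr 's/(SYN DISPLAY=)("TRUE")([^\n]*)\n([^\n]*)SYN DISPLAY=("FALSE")/\1\5\3\n\4\1\2/g' "$sample")

printf '%s\n' "$result"

# Clean up the temporary directory.
rm -rf "$tmpdir"
```

In the printed result the line inside <DVAL> should still read DISPLAY="TRUE", because the line after it does not contain SYN DISPLAY="FALSE", while the Appels line and the line below it should have their DISPLAY values swapped.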