regex macos sed backreference

sed - Back reference on match pattern does not work


I need to find dates in XML files that are in the format 2021-06-25T21:17:51Z and replace them with the format 2021-06-25T21:17:51.001Z.

I thought about using a regexp with sed, but back references do not work.

1.xml could look like this, but the real files have many more fields, and some fields are already correct.

<Doc>
   <PUB_DATE>2021-06-25T21:17:51Z</PUB_DATE><!-- to change -->
   <DATE_COLLECT_100>2021-06-25T21:17:51Z</DATE_COLLECT_100><!-- to change -->

   <DATE_CREATION>2021-06-25T21:17:51.001Z</DATE_CREATION><!-- keep it like this -->
</Doc>

Desired output is

<Doc>
   <PUB_DATE>2021-06-25T21:17:51.001Z</PUB_DATE><!-- to change -->
   <DATE_COLLECT_100>2021-06-25T21:17:51.001Z</DATE_COLLECT_100><!-- to change -->

   <DATE_CREATION>2021-06-25T21:17:51.001Z</DATE_CREATION><!-- keep it like this -->
</Doc>

Here is my sed command:

$ sed -Ee 's#<(PUB_DATE|DATE_COLLECT_100){1}>([[:digit:]]{4}-[[:digit:]]{2}-[[:digit:]]{2}T[[:digit:]]{2}:[[:digit:]]{2}:[[:digit:]]{2})Z</\1>#<\1>\2.001Z</\1>#' 1.xml

The regexp seems to be OK in regex101.

Here is a representation of it made with https://regexper.com.

Are back references allowed in sed when they are used in the search portion? Am I missing something about sed? Is there a bug?

Sed version: well... I don't know; sed --version, sed -v, and man sed don't give it. I'm on OSX.


Solution

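  • BSD (macOS) sed doesn't support the back-reference \1 inside the search pattern when extended regular expressions (-E) are used; POSIX leaves back-references undefined for EREs, so this is not portable.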

    One option is perl:

    perl -pe 's#<(PUB_DATE|DATE_COLLECT_100)>(\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2})Z</\1>#<\1>\2.001Z</\1>#' 1.xml
    

    Or else install GNU sed using Homebrew (brew install gnu-sed) and then use:

    gsed -E 's#<(PUB_DATE|DATE_COLLECT_100)>([[:digit:]]{4}-[[:digit:]]{2}-[[:digit:]]{2}T[[:digit:]]{2}:[[:digit:]]{2}:[[:digit:]]{2})Z</\1>#<\1>\2.001Z</\1>#' 1.xml
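
A third possibility, if installing anything is not an option, is to avoid the back-reference in the search pattern altogether. The sketch below is one way to do that, not a tested macOS recipe: it captures the closing tag as its own group and reuses it as \3 in the replacement. The names add_millis, tag and ts are made up for illustration, and [0-9] stands in for [[:digit:]] only for brevity.

```shell
#!/bin/sh
# Sketch: no back-reference appears in the search pattern, only in the
# replacement, so nothing here relies on GNU-specific behavior.
# tag, ts and add_millis are hypothetical names chosen for this example.
tag='(PUB_DATE|DATE_COLLECT_100)'
ts='([0-9]{4}-[0-9]{2}-[0-9]{2}T[0-9]{2}:[0-9]{2}:[0-9]{2})'

add_millis() {
  # Reads XML on stdin and writes the rewritten XML to stdout.
  # \1 = opening tag name, \2 = timestamp without "Z", \3 = closing tag name.
  sed -E "s#<${tag}>${ts}Z</${tag}>#<\\1>\\2.001Z</\\3>#"
}

# Sample lines taken from the question: one to change, one to keep.
result=$(printf '%s\n' \
  '<PUB_DATE>2021-06-25T21:17:51Z</PUB_DATE>' \
  '<DATE_CREATION>2021-06-25T21:17:51.001Z</DATE_CREATION>' | add_millis)
printf '%s\n' "$result"
```

To run it on a file, something like add_millis < 1.xml > 1.fixed.xml should do. The trade-off is that the pattern no longer checks that the opening and closing tags are the same; a mismatched pair such as <PUB_DATE>...</DATE_COLLECT_100> would also be rewritten, with both tag names left as they were.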