I need to find dates in XML files that are in this format: 2021-06-25T21:17:51Z
and replace them with this format: 2021-06-25T21:17:51.001Z
I thought about using a regexp with sed,
but back-references do not work.
1.xml could look like this, but I have many more fields in those files, and some fields are already correct.
<Doc>
<PUB_DATE>2021-06-25T21:17:51Z</PUB_DATE><!-- to change -->
<DATE_COLLECT_100>2021-06-25T21:17:51Z</DATE_COLLECT_100><!-- to change -->
<DATE_CREATION>2021-06-25T21:17:51.001Z</DATE_CREATION><!-- keep it like this -->
</Doc>
The desired output is:
<Doc>
<PUB_DATE>2021-06-25T21:17:51.001Z</PUB_DATE><!-- to change -->
<DATE_COLLECT_100>2021-06-25T21:17:51.001Z</DATE_COLLECT_100><!-- to change -->
<DATE_CREATION>2021-06-25T21:17:51.001Z</DATE_CREATION><!-- keep it like this -->
</Doc>
Here is my sed command:
$ sed -Ee 's#<(PUB_DATE|DATE_COLLECT_100){1}>([[:digit:]]{4}-[[:digit:]]{2}-[[:digit:]]{2}T[[:digit:]]{2}:[[:digit:]]{2}:[[:digit:]]{2})Z</\1>#<\1>\2.001Z</\1>#' 1.xml
The regex seems to be OK on regex101.
I also visualized it with https://regexper.com
Are back-references allowed in sed when they are used in the search portion?
Am I missing something about sed?
Is there a bug?
sed version: I don't know; none of sed --version, sed -v, or man sed shows it. I'm on OSX.
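BSD (OSX) sed doesn't support the back-reference \1 inside the regex pattern when -E is used; POSIX extended regular expressions only define \1 for the replacement part.
Your choices are perl: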
perl -pe 's#<(PUB_DATE|DATE_COLLECT_100)>(\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2})Z</\1>#<\1>\2.001Z</\1>#' 1.xml
Or else install GNU sed using Homebrew (brew install gnu-sed) and then use:
gsed -E 's#<(PUB_DATE|DATE_COLLECT_100)>([[:digit:]]{4}-[[:digit:]]{2}-[[:digit:]]{2}T[[:digit:]]{2}:[[:digit:]]{2}:[[:digit:]]{2})Z</\1>#<\1>\2.001Z</\1>#' 1.xml
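If installing anything is not an option, a possible workaround is to drop the back-reference from the pattern and capture the closing tag in a third group instead. This is only a sketch, not a strict equivalent: the helper name add_millis is made up, and it assumes well-formed XML, because the pattern no longer checks that the closing tag equals the opening one.

```shell
# Hypothetical helper: append .001 to second-precision timestamps
# without using a back-reference in the search pattern.
add_millis() {
  # Matches e.g. 2021-06-25T21:17:51 (no fractional seconds).
  ts='[0-9]{4}-[0-9]{2}-[0-9]{2}T[0-9]{2}:[0-9]{2}:[0-9]{2}'
  tags='PUB_DATE|DATE_COLLECT_100'
  # Group 1 = opening tag, group 2 = timestamp, group 3 = closing tag.
  # Back-references appear only in the replacement.
  sed -E "s#<($tags)>(${ts})Z</($tags)>#<\\1>\\2.001Z</\\3>#g" "$@"
}
```

It can be called as add_millis 1.xml, or fed from a pipe. The output goes to stdout, so the original file is untouched unless you redirect the result to a new file.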