
Remove the XML declaration from XML data using Perl or awk


Our application is on the receiving end and does retro-analysis of XML data. It has no Java or .NET available, but it runs on Unix, so awk and Perl are available.

The XML messages in the file contain:

<?xml version="1.0" encoding="ISO-8859-1" ?> 

I tried a few options in Perl and awk to get them removed, but couldn't get these to work:

perl -p -i -e "s/<?xml version="1.0" encoding="ISO-8859-1" ?>//g"  inputFile
perl -p -i -e "s/<?xml version=\"1.0\" encoding=\"ISO-8859-1\" ?>//g"  inputFile
perl -p -i -e "s/<\?xml version="1.0" encoding="ISO-8859-1" \?>//g"  inputFile

Is there another way to do this using Perl or awk?


Solution

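  • This worked for me without overwriting the data file:

    perl -p -e 's/<\?xml version="1.0" encoding="ISO-8859-1" \?>//g' inputFile

    In a POSIX shell, the single quotes stop the shell from stripping the double quotes inside the pattern, and each \? matches a literal question mark instead of making the preceding character optional. Your third attempt escaped the question marks but was still wrapped in double quotes, so the shell removed the inner quotes before Perl ever saw the pattern.

    I'd only overwrite the file (-i) once I was sure the basic regex worked without doing damage.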