
XML::Twig adds blank lines to multi-line element values


I am using XML::Twig to parse a file in my Perl script, and I am fairly new to it. My XML file contains entries like the following (a simplified sample):

<?xml version="1.0" encoding="UTF-8"?>
<mytag1 name="abc">
    <mytag2>This is line 1.
        This is line 2.
        This is line 3.
     </mytag2>
</mytag1>

In my Perl script, I am doing something like this:

my $twig = XML::Twig->new( keep_encoding=>1, keep_atts_order=>1, pretty_print => 'indented', comments => 'keep' );
$twig->parsefile($in_file);

After that I run some validation code and write the document back out, which produces output like this:

<?xml version="1.0" encoding="UTF-8"?>
<mytag1 name="abc">
    <mytag2>This is line 1.

        This is line 2.

        This is line 3.

     </mytag2>
</mytag1>

Extra blank lines appear in the output, and I am not sure what is going wrong. I searched around but could not find much useful information on this. Any help would be appreciated.


Solution

  • Remove the keep_encoding option. It is unnecessary because the input is already UTF-8, and it makes the module bypass some of the parser's features, notably the one that normalizes line endings (CR/LF).

    It should not be used anyway: it's a relic of a time when Unicode was not as prevalent as today. It allowed people stuck with old encodings to still be able to process their XML.

    Thanks ikegami!
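
As a rough illustration of the advice above, here is a minimal sketch that parses a small document without keep_encoding and counts the carriage returns left in the element text. The sample string, its CRLF line endings, and the variable names are assumptions made for the example, not taken from the original file. keep_atts_order is left out only so the sketch runs without Tie::IxHash installed.

```perl
use strict;
use warnings;
use XML::Twig;

# Hypothetical sample input with Windows-style CRLF line endings
# (an assumption about what the real file looks like).
my $xml = qq{<?xml version="1.0" encoding="UTF-8"?>\r\n}
        . qq{<mytag1 name="abc">\r\n}
        . qq{    <mytag2>This is line 1.\r\n}
        . qq{        This is line 2.\r\n}
        . qq{    </mytag2>\r\n}
        . qq{</mytag1>\r\n};

# Options from the question, minus keep_encoding
# (and minus keep_atts_order, which needs Tie::IxHash).
my $twig = XML::Twig->new(
    pretty_print => 'indented',
    comments     => 'keep',
);
$twig->parse($xml);

# Count carriage returns remaining in the text of mytag2.
my $text     = $twig->root->first_child('mytag2')->text;
my $cr_count = () = $text =~ /\r/g;
print "carriage returns left in text: $cr_count\n";
```

Without keep_encoding the XML parser normalizes line endings, so the count should be zero. Adding keep_encoding => 1 back to the constructor is a quick way to check whether stray carriage returns are what cause the blank lines in your own file.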