xml perl xml-libxml

When is a new() call necessary for Perl's XML::LibXML?


I wrote a program using Perl's XML::LibXML.

It creates an XML::LibXML::Document using load_xml().

I was surprised that it doesn't need a call to new() first.

When is a call to new() necessary?

Here is the code that surprisingly (to me) works:

#!/usr/bin/env perl
use 5.020;
use warnings;
use XML::LibXML;

# the input XML
my $inputstr = <<XML;
<a>
<b class="type1">some type 1 data</b>
<b class="type2">some type 2 data</b>
<b class="type3">some type 3 data</b>
<b class="type4">some type 4 data</b>
<b notaclass="type1">don't change this</b>
<c class="type1">don't change this either</c>
</a>
XML

my $dom = XML::LibXML->load_xml(
string => $inputstr
);

say $dom->toString();

Solution

  • XML::LibXML uses a style where methods such as load_xml and load_html work as both object and class methods. This is for convenience. If you call XML::LibXML->load_xml as a class method, it creates a parser object for you, configures it with your arguments, loads the XML, and then discards the parser object. This is the relevant part of load_xml in the XML::LibXML source:

    sub load_xml {
      my $class_or_self = shift;
      my %args = map { ref($_) eq 'HASH' ? (%$_) : $_ } @_;
     
      my $URI = delete($args{URI});
      $URI = "$URI"  if defined $URI; # stringify in case it is an URI object
      my $parser;
      # if called as an object method
      if (ref($class_or_self)) { 
        $parser = $class_or_self->_clone();
        $parser->{XML_LIBXML_PARSER_OPTIONS} = $parser->_parser_options(\%args);
      # if called as a class method
      } else {
        $parser = $class_or_self->new(\%args);
      }
      # ... (excerpt ends here; the remainder of load_xml is omitted)
    

    That means these two pieces of code are equivalent. The first is the class-method call; the second roughly spells out what that call does internally.

    my $dom = XML::LibXML->load_xml(
      string => $inputstr
    );
    
    my $dom = XML::LibXML->new(
      string => $inputstr
    )->load_xml(
      string => $inputstr
    );
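
    As a quick check of the equivalence described above, here is a minimal sketch that parses the same string both ways and compares the serialized results. The variable names and the shortened input string are only illustrative, and it assumes default parser options in both cases.

    ```perl
    use strict;
    use warnings;
    use XML::LibXML;

    # Shortened, illustrative input (not the full document from the question).
    my $xml = '<a><b class="type1">some type 1 data</b></a>';

    # Class method: a temporary parser is created behind the scenes.
    my $from_class = XML::LibXML->load_xml(string => $xml);

    # Object method: create the parser explicitly, then ask it to load.
    my $parser      = XML::LibXML->new;
    my $from_object = $parser->load_xml(string => $xml);

    print "both calls produced the same document\n"
        if $from_class->toString eq $from_object->toString;
    ```

    With no options to set and only one document to parse, the explicit new() adds nothing, which is why the class-method form is enough in the question's program.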
    

    When is a call to new() necessary?

    Call new() when you want to change the default parser options and reuse the parser object, or when you want to use a method that is only available as an object method.

    For example, you might want to pass a configured XML parser into another function, or use it to configure the behavior of another object.

    # Some::Class is a placeholder for any class that accepts a parser object.
    my $parser = XML::LibXML->new(pedantic_parser => 1);
    my $object = Some::Class->new(xml_parser => $parser);
    
    # This will use $parser which has been configured to be pedantic.
    $object->do_something_with_an_xml_document($document);
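
    To make the reuse case concrete, here is a small runnable sketch in which a parser is configured once and then handed to a function that parses several strings with it. The helper parse_all and the sample strings are hypothetical, invented for illustration. no_blanks is a documented XML::LibXML parser option, and the sketch assumes that options set in new() carry over when load_xml is called as an object method.

    ```perl
    use strict;
    use warnings;
    use XML::LibXML;

    # Hypothetical helper: parses every string with the parser it is given.
    sub parse_all {
        my ($parser, @xml_strings) = @_;
        return map { $parser->load_xml(string => $_) } @xml_strings;
    }

    # Configure the parser once; no_blanks asks libxml2 to drop
    # whitespace-only text nodes while parsing.
    my $parser = XML::LibXML->new(no_blanks => 1);

    my @docs = parse_all(
        $parser,
        "<a>\n  <b>one</b>\n</a>",
        "<a>\n  <b>two</b>\n</a>",
    );

    # Print the text of each <b> element to show both documents were parsed.
    print $_->findvalue('/a/b'), "\n" for @docs;
    ```

    The caller decides how parsing should behave, and parse_all never needs to know which options were chosen. That separation is the main reason to hold on to a parser object instead of using the class-method shortcut.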