For example, given this snippet:
<p>content 1 of p <span>content of span</span> content 2 of p </p>
I would like to obtain only "content 1 of p" and "content 2 of p", not "content of span".
Is there a way to do it?
Using an XPath:
for my $text_node ($node->findnodes('text()')) {
    say $text_node;
}
Without using an XPath:
for my $child_node ($node->childNodes()) {
    next if $child_node->nodeType != XML_TEXT_NODE;
    say $child_node;
}
Both output the following:
content 1 of p
content 2 of p
The rest of the program, which goes before either loop:
use strict;
use warnings;
use feature qw( say );
use XML::LibXML qw( XML_TEXT_NODE );
my $xml = '<p>content 1 of p <span>content of span</span> content 2 of p </p>';
my $doc = XML::LibXML->new->parse_string($xml);
my $node = $doc->documentElement();
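Putting the pieces together, a complete script might look like the sketch below. The helper name direct_text is made up for illustration and is not part of XML::LibXML. The trimming step is also an addition, on the assumption that the spaces around the text nodes in the sample are unwanted.

```perl
use strict;
use warnings;
use feature qw( say );
use XML::LibXML;

# Hypothetical helper (not an XML::LibXML method): returns the text that is a
# direct child of $node, skipping any text nested inside child elements.
sub direct_text {
    my ($node) = @_;
    my @texts;
    for my $text_node ($node->findnodes('text()')) {
        my $text = $text_node->nodeValue;
        $text =~ s/^\s+|\s+$//g;               # trimming is an addition, see above
        push @texts, $text if length $text;    # drop whitespace-only nodes
    }
    return @texts;
}

my $xml = '<p>content 1 of p <span>content of span</span> content 2 of p </p>';
my $doc = XML::LibXML->new->parse_string($xml);

# Prints one direct text chunk of <p> per line.
say for direct_text($doc->documentElement());
```

The XPath version is used here only because it is shorter. The childNodes loop with the XML_TEXT_NODE check would fit inside the same helper equally well.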