I have an XML file whose root element tag is <__> (two underscores). However, when that tag name is used in the twig_handlers list, XML::Twig->new dies with the error message:
unrecognized expression in handler: '__'
Actually, ANY tag starting with an underscore produces this error, except for Twig's special tags _all_ and _default_, either of which I can use to process the file at the expense of throwing away all the handler callbacks except the last.
The invocation which fails is:
XML::Twig->new (twig_handlers => { '__' => \&show })
I imagine there's an XML::Twig XPath expression that can be used here, but the CPAN documentation is pretty vague about the syntax. I also now wonder what I'd have to do to get at an element <_all_> :)
If anyone has a suggestion it would be much appreciated.
The problem only occurs when the twig is created: once processing has started (using the handler expression _all_), <__> elements at any level in the input are processed normally.
If anyone wants to play with the problem, here's the program I was using to look for a solution. Set $xpath to the expression you want to test.
use strict;
use XML::Twig;

my $xpath = '_all_';  # <---- fails if one puts '__' here

my $xml = <<EOS;      # <---- here's the XML data to process
<__>
<AA>first</AA>
<__>second</__>
</__>
EOS

sub show {
    print "handler called for element ", $_->gi, ", whose children are\n";
    my @children = $_->children;
    for my $elt (@children) {
        print "\t", $elt->gi, " holds \"", $elt->text, "\"\n";
    }
    1;
}

my $twig = XML::Twig->new (twig_handlers => { $xpath => \&show });
$twig->parse ($xml);
Which version of XML::Twig are you using? This is a bug that was fixed in version 3.38.
From the Changes file:
version 3.38
date: 2011-02-27
# minor maintenance release
fixed: RT 65865: _ should be allowed at the start on an XML name
https://rt.cpan.org/Ticket/Display.html?id=65865
reported by Steve Prokopowich
And indeed, when I use '__' as the value for $xpath, the code runs without errors and gives the correct output.
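To see which side of the fix a given installation is on, something like the following sketch might help. The helper name has_underscore_fix is made up for illustration and is not part of XML::Twig; the 3.38 threshold is taken from the Changes entry quoted above. The script loads XML::Twig only if it is available, so it should still run on a machine without the module.

```perl
use strict;
use warnings;
use version;

# Hypothetical helper (not part of XML::Twig): true if the given version
# string is at least 3.38, the release whose Changes entry lists RT 65865.
sub has_underscore_fix {
    my ($installed) = @_;
    return version->parse($installed) >= version->parse('3.38');
}

# Load XML::Twig only if it is installed, so the script runs either way.
if ( eval { require XML::Twig; 1 } ) {
    my $v = XML::Twig->VERSION;    # returns $XML::Twig::VERSION
    print "XML::Twig $v: ",
        ( has_underscore_fix($v) ? "includes the fix" : "predates the fix" ),
        "\n";
}
else {
    print "XML::Twig is not installed\n";
}
```

If the reported version predates 3.38, upgrading from CPAN is probably simpler than working around the handler parsing.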