In the following data file, I want to treat each <Field> tag as a child of <Register>, and each <Register> as a child of <Partition>. Basically, I am trying to extract the details of each <Partition> together with its corresponding <Register> and <Field> elements. Since all these tags are siblings rather than nested in a parent-child relationship, how can I get my desired output?
The file is very large, so I do not want to restructure it into a parent-child hierarchy by hand, as that would require find/replace and manual intervention.
<Partition>
<Name>1</Name>
<Abstract>2</Abstract>
<Description>3</Description>
<ParentName>4</ParentName>
</Partition>
<Partition>
<Name>8</Name>
<Abstract></Abstract>
<Description>9</Description>
<ParentName>10</ParentName>
</Partition>
<Register>
<Name>12</Name>
<Abstract></Abstract>
<Description>13</Description>
<ParentName>14</ParentName>
<Size>32</Size>
<AccessMode>15</AccessMode>
<Type>16</Type>
</Register>
<Field>
<Name>17</Name>
<Abstract></Abstract>
<Description></Description>
<ParentName></ParentName>
</Field>
<Field>
.
.
.
</Field>
<Register>
.
.
.
</Register>
<Field>
.
.
.
</Field>
<Field>
.
.
.
</Field>
<Partition>
<Name>88</Name>
<Abstract></Abstract>
<Description></Description>
<ParentName>55</ParentName>
</Partition>
<Register>
.
.
.
</Register>
<Field>
.
.
.
</Field>
<Partition>
.
.
.
</Partition>
<Partition>
.
.
.
</Partition>
<Partition>
.
.
.
</Partition>
<Register>
.
.
.
</Register>
I am using the XML::Twig package, and here is my code snippet:
foreach my $register ( $twig->get_xpath('//Register') ) # get each <Register>
{
    #print $register, "\n";
    my $reg_name        = $register->first_child('Name')->text;
    my $reg_abstract    = $register->first_child('Abstract')->text;
    my $reg_description = $register->first_child('Description')->text;
    .
    .
    .
    foreach my $xml_field ($register->get_xpath('Field'))
    {
        my $reg_field_name     = $xml_field->first_child('Name')->text;
        my $reg_field_abstract = $xml_field->first_child('Abstract')->text;
        #print "$reg_field_name \n";
        .
        .
        .
    }
}
As per your comment, if you want to rewrite the file with Register and Field elements as children of Partition elements, here is what you could do.
The simplest solution loads the whole file into memory:
#!/usr/bin/env perl
use strict;
use warnings;
use XML::Twig;

my $test_file= 'test.xml';

# each handler is called once its element has been completely parsed
XML::Twig->new( twig_handlers => { 'Register|Field' => \&child,
                                 },
                pretty_print  => 'indented',
              )
         ->parsefile( $test_file)
         ->print;

# move the element to the end of the closest preceding Partition
sub child
  { my( $t, $child)= @_;
    $child->move( last_child => $child->prev_sibling( 'Partition'));
  }
Since you mentioned that the file can be very large, below is a slightly more complex version that keeps only 2 Partition elements in memory (including the new children of the first one). When a Partition is parsed, it uses flush_up_to to flush the tree up to the previous Partition:
#!/usr/bin/env perl
use strict;
use warnings;
use XML::Twig;

my $test_file= 'test.xml';

XML::Twig->new( twig_handlers => { 'Partition'      => \&parent,
                                   'Register|Field' => \&child,
                                 },
                pretty_print  => 'indented',
              )
         ->parsefile( $test_file);

# move the element to the end of the closest preceding Partition
sub child
  { my( $t, $child)= @_;
    $child->move( last_child => $child->prev_sibling( 'Partition'));
  }

# a new Partition means the previous one is complete: output it and free it
sub parent
  { my( $t, $partition)= @_;
    if( my $prev_partition = $partition->prev_sibling( 'Partition'))
      { $t->flush_up_to( $prev_partition); }
  }
Note that since flush_up_to is used, the rest of the tree is automatically flushed at the end of the parsing.
If you need to write the XML to a specific file instead of STDOUT, you can also pass a filehandle to flush_up_to.
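As a rough sketch of that last point, the streaming version above could be wrapped so that it writes to a file. The sub name nest_partitions and the file names in the usage line are made up for illustration; only flush_up_to, move, prev_sibling and the twig_handlers option come from XML::Twig. It assumes the data has a single root element around the Partition, Register and Field elements, and that every Register and Field comes after at least one Partition. The XML::Twig documentation says the final automatic flush goes to the proper filehandle, but it is worth checking the tail of the output on your own data.

```perl
#!/usr/bin/env perl
use strict;
use warnings;
use XML::Twig;

# nest_partitions is a hypothetical helper name, not part of XML::Twig
sub nest_partitions
  { my( $in_file, $out_file)= @_;

    open( my $out, '>', $out_file) or die "cannot create $out_file: $!";

    XML::Twig->new(
        twig_handlers =>
          { # a new Partition means the previous one is complete:
            # write it to $out and release its memory
            'Partition' => sub
              { my( $t, $partition)= @_;
                if( my $prev= $partition->prev_sibling( 'Partition'))
                  { $t->flush_up_to( $prev, $out); }
              },
            # move the element to the end of the closest preceding Partition
            'Register|Field' => sub
              { my( $t, $child)= @_;
                $child->move( last_child => $child->prev_sibling( 'Partition'));
              },
          },
        pretty_print => 'indented',
      )
      ->parsefile( $in_file);

    close( $out) or die "cannot close $out_file: $!";
  }

# usage (file names are placeholders): perl nest.pl test.xml nested.xml
nest_partitions( @ARGV) if @ARGV == 2;
```

One caveat with this sketch: if the file contains a single Partition, flush_up_to is never called, so nothing is flushed and the output file stays empty. In that case, calling print with the filehandle after parsing would be needed.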