I need to parse some HTML. The row IDs follow these patterns:
<tr id="date">.....</tr>
<tr id="band01"><td>field1</td><td>field2</td></tr>
<tr id="band02">...contents...</tr>
.....
<tr id="(others)">.....
I'm using Perl's Mojo::DOM parser and want to extract every id that starts with "band" followed by a number, along with the contents of each matching element.
How could I achieve this?
The E[foo^="bar"] selector matches any element with a "foo" attribute starting with "bar". Thus you can use:
my $dom = Mojo::DOM->new($html);
my $rows = $dom->find('tr[id^="band"]');
$rows would be a Mojo::Collection of Mojo::DOM objects, one per matching element, each including its contents. For example, to get the list of matched IDs:
my @ids = $rows->map(attr => 'id')->each;
Or with more standard Perl:
my @ids = map { $_->{id} } @$rows;
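Note that the ^= selector only checks the prefix, so it would also match an id such as "bandage". If the "followed by a number" part matters, one option is to filter the collection with a regex afterwards. The following is a minimal sketch under some assumptions: the sample HTML (including the "bandage" row) is made up for illustration, the contents of interest are taken to be the text of each <td>, and %cells_by_id is just a name chosen here.

```perl
use strict;
use warnings;
use Mojo::DOM;

# Hypothetical sample input, modelled on the rows in the question.
my $html = <<'HTML';
<table>
<tr id="date"><td>2020-01-01</td></tr>
<tr id="band01"><td>field1</td><td>field2</td></tr>
<tr id="band02"><td>field3</td></tr>
<tr id="bandage"><td>not a band row</td></tr>
</table>
HTML

my $dom = Mojo::DOM->new($html);

# Map each matching id to an array ref of its cell texts.
my %cells_by_id;
$dom->find('tr[id^="band"]')
    # Keep only ids that are exactly "band" followed by digits.
    ->grep(sub { $_->attr('id') =~ /\Aband\d+\z/ })
    ->each(sub {
        my $row = shift;
        $cells_by_id{ $row->attr('id') }
            = $row->find('td')->map('text')->to_array;
    });

for my $id (sort keys %cells_by_id) {
    print "$id: @{ $cells_by_id{$id} }\n";
}
```

If you want the raw markup of each row rather than the cell text, $row->content (inner HTML) or $row->to_string (the whole element) could be stored instead.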