I am using a foreach to loop through links. Do I need a $mech->back(); to continue the loop, or is that implicit?
Furthermore, do I need a separate $mech2 object for nested foreach loops?
The code I currently have gets stuck (it does not complete) and ends on the first page where td#tabcolor3 is not found.
foreach my $sector ($mech->selector('a.link2'))
{
    $mech->follow_link($sector);
    foreach my $place ($mech->selector('td#tabcolor3'))
    {
        if (($mech->selector('td#tabcolor3', all=>1)) >= 1)
        {
            $mech->follow_link($place);
            print $_->{innerHTML}, '\n'
                for $mech->selector('td.dataCell');
            $mech->back();
        }
        else
        {
            $mech->back();
        }
    }
}
You cannot access information from a page when it is no longer on display. However, foreach builds its list before it starts iterating, so the code you have written should be fine.
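As a quick check of that claim, here is a minimal sketch in plain Perl with no browser involved. The current_links sub is a hypothetical stand-in for $mech->selector; the counter shows that the list expression is evaluated once, before the first iteration.

```perl
use strict;
use warnings;

# Hypothetical stand-in for $mech->selector: counts how often it is called.
my $calls = 0;
sub current_links {
    $calls++;
    return ('link-a', 'link-b', 'link-c');
}

my @seen;
for my $link (current_links()) {
    push @seen, $link;    # the loop body never re-runs current_links()
}

print "calls=$calls seen=@seen\n";
```

This prints calls=1, so whatever happens inside the loop cannot change which items are iterated over. It says nothing about whether those items are still usable after the browser has moved to another page.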
There is no need for the call to back, as the links are absolute. If you had used click, there would have to be a link in the page to click on, but with follow_link all you are doing is going to a new URL.
There is also no need to check the number of links to follow, as a for
loop over an empty list will simply not be executed.
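A minimal sketch of that point in plain Perl. It assumes that a selector call with no matches amounts to an empty list rather than an exception, which I have not verified against your page; @no_links is a hypothetical stand-in for that result.

```perl
use strict;
use warnings;

# Hypothetical stand-in for a selector call that matched nothing.
my @no_links   = ();
my $iterations = 0;

for my $link (@no_links) {
    $iterations++;    # never reached when the list is empty
}

print "iterations=$iterations\n";
```

The body is skipped entirely, so no separate count check is needed before the loop.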
To make things clearer, I suggest that you assign the results of selector to an array before the loop, like this:
my @sectors = $mech->selector('a.link2');
for my $sector (@sectors) {
    $mech->follow_link($sector);
    my @places = $mech->selector('td#tabcolor3');
    for my $place (@places) {
        $mech->follow_link($place);
        # Double quotes are needed so that \n is a newline, not a literal backslash and n
        print $_->{innerHTML}, "\n" for $mech->selector('td.dataCell');
    }
}
Update
My apologies. It seems that follow_link is finicky and needs to follow a link on the current page.
I suggest that you extract the href attribute from each link and use get instead of follow_link.
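# Collect plain URL strings first; they stay valid after the browser leaves the page
my @selectors = map $_->{href}, $mech->selector('a.link2');
for my $selector (@selectors) {
    $mech->get($selector);
    # Assumes the matched elements expose an href; if td#tabcolor3 is a plain
    # table cell, a selector such as 'td#tabcolor3 a' may be needed instead
    my @places = map $_->{href}, $mech->selector('td#tabcolor3');
    for my $place (@places) {
        $mech->get($place);
        # Double quotes are needed so that \n is a newline, not a literal backslash and n
        print $_->{innerHTML}, "\n" for $mech->selector('td.dataCell');
    }
}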
Please let me know whether this works on the site you are connecting to.
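If you want to check the control flow without a browser, here is a rough offline sketch of the same nested loop. %site and get_page are hypothetical stand-ins for the real site and for $mech->get followed by extracting the hrefs; the URLs are invented for illustration.

```perl
use strict;
use warnings;

# Hypothetical site map: each URL maps to the hrefs found on that page.
my %site = (
    '/index'    => ['/sector/1', '/sector/2'],
    '/sector/1' => ['/place/a'],
    '/sector/2' => [],              # a page with no matching cells
    '/place/a'  => [],
);

my @visited;

# Stand-in for "$mech->get($url)" plus collecting the hrefs as plain strings.
sub get_page {
    my ($url) = @_;
    push @visited, $url;
    return @{ $site{$url} };
}

my @sectors = get_page('/index');
for my $sector (@sectors) {
    my @places = get_page($sector);
    for my $place (@places) {
        get_page($place);
    }
}

print join(' ', @visited), "\n";
```

Because only URL strings are held across page loads, no call to back is needed, and the sector page with no matches is simply passed over: the visit order is /index /sector/1 /place/a /sector/2.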