
Perl printing rows and columns from HTML table


Here's my temp.html:

<table border="1">
<tr>
<td>row 1, cell 1</td>
<td>row 1, cell 2</td>
</tr>
<tr>
<td>row 2, cell 1</td>
<td>row 2, cell 2</td>
</tr>
</table>

I am trying to print each element of the table above using the code below:

#!/usr/bin/perl

use strict;
use Data::Dumper;
use HTML::TableExtract;

my $tex = HTML::TableExtract->new(keep_html=>1);

$tex->parse_file('./temp.html');
my ($table) = $tex->tables;
#print Dumper($table);

my $numColumns = @{$table->rows->[0]};
print "\n numColumns = $numColumns\n";
my $numRows = @{$table->rows};
print "\n numRows = $numRows\n";

for my $rowIndex ( 0..$numRows-1 ) { 
    for my $columnIndex ( 0..$numColumns-1 ) { 
       print "\n row $rowIndex column $columnIndex $table->rows->[$rowIndex][$columnIndex] ";
    }   
}

It prints:

row 0 column 0 HTML::TableExtract::Table=HASH(0x8e7d7f8)->rows->[0][0] 
row 0 column 1 HTML::TableExtract::Table=HASH(0x8e7d7f8)->rows->[0][1] 
row 1 column 0 HTML::TableExtract::Table=HASH(0x8e7d7f8)->rows->[1][0] 
row 1 column 1 HTML::TableExtract::Table=HASH(0x8e7d7f8)->rows->[1][1]

If I use @{$table->rows->[$rowIndex]}->[$columnIndex] instead of $table->rows->[$rowIndex][$columnIndex], I get the correct output, but with a warning. How can I get rid of the warning?

Using an array as a reference is deprecated at t.pl line 21.

row 0 column 0 row 1, cell 1 
row 0 column 1 row 1, cell 2 
row 1 column 0 row 2, cell 1 
row 1 column 1 row 2, cell 2

Solution

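  • You cannot call methods inside strings. Perl interpolates variables in a string, including element access on hashes and arrays, but method calls are not supported. In your string only $table is interpolated (it stringifies to HTML::TableExtract::Table=HASH(...)), the text ->rows-> is printed literally, and the two index variables are filled in on their own, which is exactly the output you see.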

    Instead of

    print "... $table->rows->[$rowIndex][$columnIndex] ";
    

    You want

    my $cell_value = $table->rows->[$rowIndex][$columnIndex];
    print "... $cell_value ";
    

    Another option is to dereference inside the string. Your attempt was close; written as an element access, it gives the same output without the warning:

    print "... ${$table->rows->[$rowIndex]}[$columnIndex] ";
    

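    which works because the method call is now inside a dereference block, which can contain arbitrary code. The warning you saw comes from the @{ ... }->[...] form, which treats an array as if it were a reference; ${ ... }[...] accesses the element directly instead. A more common way is to use the “shopping cart” pseudo-operator @{[ ... ]} (also known as the “baby cart”), which builds an anonymous array from arbitrary code and dereferences it in place. Note that the code inside runs in list context: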

    print "... @{[ $table->rows->[$rowIndex][$columnIndex] ]} ";
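
The three approaches above can be compared side by side in a small script. This is only a sketch: to keep it runnable without installing HTML::TableExtract, it uses a hypothetical stand-in class, Demo::Table, whose rows method returns a reference to an array of rows, which is how the question's code uses the real object. The class name, the variable names, and the chosen indices are assumptions made for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical stand-in for the table object from the question: rows()
# returns a reference to an array of row arrays. Not part of any real module.
{
    package Demo::Table;
    sub new  { my ($class, @rows) = @_; return bless { rows => [@rows] }, $class }
    sub rows { return $_[0]{rows} }
}

my $table = Demo::Table->new(
    [ 'row 1, cell 1', 'row 1, cell 2' ],
    [ 'row 2, cell 1', 'row 2, cell 2' ],
);

my ($rowIndex, $columnIndex) = (1, 0);

# 1. Call the method outside the string, then interpolate a plain scalar.
my $cell_value   = $table->rows->[$rowIndex][$columnIndex];
my $via_variable = "row $rowIndex column $columnIndex $cell_value";

# 2. Element access through a dereference block; the block may hold any code.
my $via_deref = "row $rowIndex column $columnIndex ${$table->rows->[$rowIndex]}[$columnIndex]";

# 3. Build an anonymous array from the expression and dereference it in place.
my $via_cart = "row $rowIndex column $columnIndex @{[ $table->rows->[$rowIndex][$columnIndex] ]}";

print "$_\n" for $via_variable, $via_deref, $via_cart;
```

All three strings should come out identical, and none of the forms should trigger the deprecation warning. The first variant is usually the easiest to read when the expression is long.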