I have an 8MB CSV file. Using Spreadsheet::Read, it takes 10 seconds to read:
use Spreadsheet::Read;

my $book = ReadData ( 'file.csv' );
my @rows = Spreadsheet::Read::rows($book->[1]); # first sheet
foreach my $i (2 .. scalar @rows) { # ignore first header row
    my $first = $rows[$i-1][1];
    #...
}
Using Text::CSV_XS, it takes 1 second:
use Text::CSV_XS;

open my $fh, "<:encoding(utf8)", 'file.csv' or die $!;
my $csv = Text::CSV_XS->new ({ diag_verbose=>1, auto_diag=>1, binary=>1, sep_char=>";" });
$csv->getline($fh); # Ignore Header
while (my $row = $csv->getline ($fh)) {
    my $first = $row->[1];
    #...
}
close ($fh);
Can I force Spreadsheet::Read to use Text::CSV_XS and expect similar performance? I tried:
my $book = Spreadsheet::Read->new (
'file.csv',
sep => ';',
parser => 'csv',
);
$ENV{SPREADSHEET_READ_CSV} = 'Text::CSV_XS';
Output of Spreadsheet::Read->parsers() is:
$VAR1 = {
'ext' => 'csv',
'def' => '',
'mod' => 'Text::CSV',
'min' => '1.17',
'vsn' => '-'
};
$VAR2 = {
'ext' => 'csv',
'def' => '',
'mod' => 'Text::CSV_PP',
'min' => '1.17',
'vsn' => '-'
};
$VAR3 = {
'vsn' => '1.50',
'min' => '0.71',
'ext' => 'csv',
'mod' => 'Text::CSV_XS',
'def' => '*'
};
$VAR4 = {
'min' => '0.01',
'vsn' => '0.87',
'def' => '*',
'mod' => 'Spreadsheet::Read',
'ext' => 'sc'
};
$VAR5 = {
'vsn' => '0.65',
'min' => '0.34',
'ext' => 'xls',
'mod' => 'Spreadsheet::ParseExcel',
'def' => '*'
};
$VAR6 = {
'min' => '0.24',
'vsn' => '0.27',
'ext' => 'xlsm',
'def' => '*',
'mod' => 'Spreadsheet::ParseXLSX'
};
$VAR7 = {
'min' => '0.24',
'vsn' => '0.27',
'def' => '*',
'mod' => 'Spreadsheet::ParseXLSX',
'ext' => 'xlsx'
};
$VAR8 = {
'min' => '0.13',
'vsn' => '-',
'ext' => 'xlsx',
'def' => '',
'mod' => 'Spreadsheet::XLSX'
};
$VAR9 = {
'vsn' => undef,
'min' => '',
'ext' => 'zzz2',
'mod' => 'Z20::Just::For::Testing',
'def' => '*'
};
Also:
$ perl -MSpreadsheet::Read -E'say Spreadsheet::Read::parses( "csv" )'
Text::CSV_XS
$ perl -MText::CSV_XS -E'say Text::CSV_XS->VERSION'
1.50
You asked whether you could force Spreadsheet::Read to use Text::CSV_XS, but you also said that the following command prints Text::CSV_XS:
perl -Mv5.14 -MSpreadsheet::Read -e'say Spreadsheet::Read::parses( "csv" )'
This demonstrates that Text::CSV_XS is being used.
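If you want to confirm this from inside a script and see how long the load itself takes, a minimal sketch along these lines may help. It assumes the same file.csv and the ";" separator from your question. The timing uses the core Time::HiRes module, which is my addition and not something your code used.

```perl
use strict;
use warnings;
use Time::HiRes ();        # core module, used here only for wall-clock timing
use Spreadsheet::Read;

# parses() reports which module Spreadsheet::Read will use for a format,
# or a false value if no suitable parser is installed.
my $backend = Spreadsheet::Read::parses("csv");
print "CSV backend: ", ($backend || "none available"), "\n";

# Assumption: file name and separator are taken from the question.
my $file = "file.csv";

my $t0   = Time::HiRes::time();
my $book = ReadData($file, sep => ";") or die "Cannot read $file\n";
printf "ReadData took %.2f s\n", Time::HiRes::time() - $t0;
```

If the first line prints Text::CSV_XS and the load is still slow, the time is being spent somewhere other than in the choice of CSV parser.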