I have a TSV file foo.tsv with the column names "a", "b", "c", "d". I want to read this file and load its contents into a PDL matrix. The file foo.tsv looks like this:
a b c d
1 6 7 4
2 7 6 10
3 8 5 6
4 9 4 8
5 10 3 7
I used this code to read the file into a matrix and print it:
use PDL::Core qw(pdl);
use PDL::IO::CSV ':all';
# Header set to the first row following https://github.com/kmx/pdl-io-csv
# Sep_char set to the tab
my $data = rcsv2D('foo.tsv', {text2bad => 1, header => 1, sep_char => "\t"});
print $data;
The printed matrix is wrong: it lacks the first row of numbers after the header:
[
[ 2 3 4 5]
[ 7 8 9 10]
[ 6 5 4 3]
[10 6 8 7]
]
I changed the header value to 'auto', which should skip rows in which every column is non-numeric:
my $data = rcsv2D('foo.tsv', {text2bad => 1, header => 'auto', sep_char => "\t"});
Now I get a warning, but the matrix looks OK:
Argument "auto" isn't numeric in foreach loop entry at C:/sw/pdl/perl/vendor/lib/PDL/IO/CSV.pm line 335, <DATA> line 207.
[
[ 1 2 3 4 5]
[ 6 7 8 9 10]
[ 7 6 5 4 3]
[ 4 10 6 8 7]
]
I do not understand why the resulting matrices differ, and why I get a wrong result when I set the header to the first row with header => 1?
It appears to be a bug that was fixed in PDL::IO::CSV 0.011. From the changelog:
0.011 2019/12/04
- fix: header option eats extra line #2
- fix: cpantesters failure on long-double perls
With 0.011, your code works fine.
use strict;
use warnings;
use PDL::IO::CSV ':all';
my $data = rcsv2D('foo.tsv', {text2bad => 1, header => 1, sep_char => "\t"});
print $data;
$ perl -e'
CORE::say join "\t", @$_
for
[qw( a b c d )],
# -- -- -- --
[qw( 1 6 7 4 )],
[qw( 2 7 6 10 )],
[qw( 3 8 5 6 )],
[qw( 4 9 4 8 )],
[qw( 5 10 3 7 )];
' >foo.tsv
$ perl a.pl
[
[ 1 2 3 4 5]
[ 6 7 8 9 10]
[ 7 6 5 4 3]
[ 4 10 6 8 7]
]
(Note that header=>'auto' is not supported by rcsv2D, and is being treated as header=>0 after issuing the warning you reported.)
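Since the behaviour depends on the installed version, it may help to check which PDL::IO::CSV you have before relying on header => 1. The following is a minimal sketch, not part of the module: the helper name csv_module_at_least is hypothetical, and it only uses Perl's built-in VERSION method, which dies when the loaded module is older than the requested version.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of PDL::IO::CSV): returns 1 if the module
# can be loaded and its version is at least $min_version, otherwise 0.
sub csv_module_at_least {
    my ($min_version) = @_;
    return eval {
        require PDL::IO::CSV;
        # UNIVERSAL::VERSION dies if the installed version is lower.
        PDL::IO::CSV->VERSION($min_version);
        1;
    } ? 1 : 0;
}

if (csv_module_at_least('0.011')) {
    print "PDL::IO::CSV is 0.011 or newer\n";
}
else {
    print "PDL::IO::CSV is missing or older than 0.011; consider upgrading\n";
}
```

If the second message is printed, upgrading the module from CPAN is probably the simplest fix.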