I have a csv file,
id1,v1,v2,v3,v4
id2,v1,v2,v6,v4
id1,v7,v8,v3,v9
id1,v10,v11,v12,v13
id2,v3,v5,v8,v7
since, the file is not sorted, and should not be! I want output as:
id1,v1|v7|v10,v2|v8|v11,v3|v12,v4|v9|v13
id2,v1|v10,v2|v5,v6|v8,v4|v7
Where, all respective values in columns are merged to respective column in record with same id, except repeated value (see v3 in 3rd column for id1) and id.
I tried it using code given here http://www.robelle.com/tips/st-export-notes.html. But it needs much more than that.
How this can be achieved using perl? I am new to perl. Thanks in advance!
Assuming you don't need any particular sort order, you can use a hash of arrays to solve this. Hashes are known as dictionaries in other languages.
use strict;
use warnings;
my %data;
while ( <DATA> ) {
my ($id, @vals) = /[^,\s]+/g;
for my $i ( 0 .. $#vals ) {
++$data{$id}[$i]{$vals[$i]};
}
}
while ( my ($id, $vals) = each %data ) {
my @vals = map { join '|', keys %$_ } @$vals;
printf "%s,%s\n", $id, join ',', @vals;
}
__DATA__
id1,v1,v2,v3,v4
id2,v1,v2,v6,v4
id1,v7,v8,v3,v9
id1,v10,v11,v12,v13
id2,v3,v5,v8,v7
output
id2,v1|v3,v5|v2,v8|v6,v7|v4
id1,v7|v10|v1,v11|v2|v8,v12|v3,v4|v13|v9