I'm trying to sort lists of names with Perl with a specific letter order to perform some special features.
The sorting would be working the same way as sort { $a cmp $b }
but with a different succession of letters.
For example, ordering with the arbitrary character order "abdrtwsuiopqe987654" ...
I tried to deal with sort { $a myFunction $b }
but I'm newbie with Perl and I don't see how to organize correctly myFunction
to get what I want.
cmp
function implemented with Perl to see how it works ?The following is probably the fastest[1]:
sub my_compare($$) {
$_[0] =~ tr{abdrtwsuiopqe987654}{abcdefghijklmnopqrs}r
cmp
$_[1] =~ tr{abdrtwsuiopqe987654}{abcdefghijklmnopqrs}r
}
my @sorted = sort my_compare @unsorted;
Or if you want something more dynamic, the following might be the fastest[2]:
my @syms = split //, 'abdrtwsuiopqe987654';
my @map; $map[ord($syms[$_])] = $_ for 0..$#syms;
sub my_compare($$) {
(pack 'C*', map $map[ord($_)], unpack 'C*', $_[0])
cmp
(pack 'C*', map $map[ord($_)], unpack 'C*', $_[1])
}
my @sorted = sort my_compare @unsorted;
We could compare character by character, but that will be far slower.
use List::Util qw( min );
my @syms = split //, 'abdrtwsuiopqe987654';
my @map; $map[ord($syms[$_])] = $_ for 0..$#syms;
sub my_compare($$) {
my $l0 = length($_[0]);
my $l1 = length($_[1]);
for (0..min($l0, $l1)) {
my $ch0 = $map[ord(substr($_[0], $_, 1))];
my $ch1 = $map[ord(substr($_[1], $_, 1))];
return -1 if $ch0 < $ch1;
return +1 if $ch0 > $ch1;
}
return -1 if $l0 < $l1;
return +1 if $l0 > $l1;
return 0;
}
my @sorted = sort my_compare @unsorted;
Technically, it can be made faster using GRT.
my @sorted =
map /\0(.*)/s,
sort
map { tr{abdrtwsuiopqe987654}{abcdefghijklmnopqrs}r . "\0" . $_ }
@unsorted;
Technically, it can be made faster using GRT.
my @sorted =
map /\0(.*)/s,
sort
map { ( pack 'C*', map $map[ord($_)], unpack 'C*', $_ ) . "\0" . $_ }
@unsorted;
cmp
is implemented by the scmp
operator.
$ perl -MO=Concise,-exec -e'$x cmp $y'
1 <0> enter
2 <;> nextstate(main 1 -e:1) v:{
3 <#> gvsv[*x] s
4 <#> gvsv[*y] s
5 <2> scmp[t3] vK/2
6 <@> leave[1 ref] vKP/REFC
The scmp
operator is implemented by the pp_scmp
function in pp.c
, which is really just a wrapper for sv_cmp_flags
in sv.c
when use locale;
isn't in effect. sv_cmp_flags
either uses C library function memcmp
or a UTF-8 aware version (depending on the type of scalar).