
How can I use "map" in Perl to build a hash reference whose keys come from one array reference and whose values come from another array?


I've searched many other Stack questions on map, but this requirement is particular and, try as I might, I cannot quite find the solution I am looking for, though I think one exists.

This question is simply about performance.

As limited background: this code segment is used to decode incoming tokens, so it runs on every web request and its performance is critical. I know "map" can be used, so I want to use it.

Here is a trimmed-down but fully working code segment that I am currently using; it works perfectly well:

use strict;
use Data::Dumper qw (Dumper);

my $api_token = { array => [ 'user_id', 'session_id', 'expiry' ], max => 3, name => 'session' };
my $token_got = [ 9923232345812112323, 1111323232000000465, 1002323001752323232 ];

my $rt;
for (my $i=0; $i<scalar @{$api_token->{array}}; $i++) {
  $rt->{$api_token->{array}->[$i]} = $token_got->[$i];
}

$rt->{type} = $api_token->{name};
print Dumper ($rt) . "\n";

The question is: What is the absolute BEST POSSIBLE PERL CODE to replicate the for loop above in terms of performance?


Solution

  • Looks like you only need a hash slice

    my %rt;
    
    @rt{ @{ $api_token->{array} } } = @$token_got;
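
    As a self-contained sketch of the slice in context, the script below uses the question's data and also sets the extra "type" key. The IDs are quoted here (an assumption, not in the original) so the large values stay exact as strings, and the name $rt_ref is only illustrative.

```perl
use strict;
use warnings;

# Data copied from the question; IDs are quoted so they stay exact as strings.
my $api_token = { array => [ 'user_id', 'session_id', 'expiry' ], max => 3, name => 'session' };
my $token_got = [ '9923232345812112323', '1111323232000000465', '1002323001752323232' ];

my %rt;

# Hash slice: assign every value to its key in a single list assignment.
@rt{ @{ $api_token->{array} } } = @$token_got;

# The extra key from the question is set the usual way.
$rt{type} = $api_token->{name};

# Take a reference if the calling code expects a hash reference.
my $rt_ref = \%rt;

print "$_ => $rt_ref->{$_}\n" for sort keys %$rt_ref;
```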
    

    Or, if the hash reference is needed

    my $rt;
    
    @{ $rt } { @{ $api_token->{array} } } = @$token_got;
    

    or with the newer postfix dereferencing (stable since Perl 5.24), on both the array and the hash slice, perhaps a bit nicer

    my $rt;
    
    $rt->@{ $api_token->{array}->@* } = @$token_got;
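
    To check that the two reference-based forms build the same hash, a minimal sketch along these lines should do. It assumes Perl 5.24 or later for the postfix syntax; the variable names $rt_classic and $rt_postfix are illustrative, and the IDs are quoted to keep them exact.

```perl
use v5.24;      # postfix dereferencing is no longer experimental from 5.24
use warnings;

my $api_token = { array => [ 'user_id', 'session_id', 'expiry' ], max => 3, name => 'session' };
my $token_got = [ '9923232345812112323', '1111323232000000465', '1002323001752323232' ];

# Circumfix slice: the undefined scalar autovivifies into a hash reference.
my $rt_classic;
@{ $rt_classic }{ @{ $api_token->{array} } } = @$token_got;

# Postfix slice: the same assignment, read left to right.
my $rt_postfix;
$rt_postfix->@{ $api_token->{array}->@* } = $token_got->@*;

# Print both results side by side for each key.
for my $key ( $api_token->{array}->@* ) {
    say "$key: classic=$rt_classic->{$key} postfix=$rt_postfix->{$key}";
}
```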
    

    One can also do it using List::MoreUtils::mesh, and in one statement

    my $rt = { mesh @{ $api_token->{array} }, @$token_got };
    

    or with pairwise from the same library

    my $rt = { pairwise { $a, $b } @{ $api_token->{array} }, @$token_got };
    

    These go via C code if the library gets installed with List::MoreUtils::XS.
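
    Since the question asks about map specifically, a map-based version could look like the sketch below. It is shown only for comparison: it is not one of the variants timed in this answer, so no speed claim is made for it. The name $names is illustrative, and the IDs are quoted to keep them exact.

```perl
use strict;
use warnings;

my $api_token = { array => [ 'user_id', 'session_id', 'expiry' ], max => 3, name => 'session' };
my $token_got = [ '9923232345812112323', '1111323232000000465', '1002323001752323232' ];

my $names = $api_token->{array};

# map over the indices and emit a (key, value) pair for each one.
# The parentheses make Perl parse the inner braces as a block, and the
# outer braces build an anonymous hash from the resulting list.
my $rt = { map { ( $names->[$_] => $token_got->[$_] ) } 0 .. $#$names };

$rt->{type} = $api_token->{name};

print "$_ => $rt->{$_}\n" for sort keys %$rt;
```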


    Benchmarked the five variants in the table below, with the tiny datasets from the question (realistic though?). Whatever implementation mesh/pairwise have, they are roughly two to three times slower than the slice-based versions.

    On an old laptop with v5.26

                  Rate use_pair use_mesh use_href use_post use_hash
    use_pair  373639/s       --     -36%     -67%     -67%     -68%
    use_mesh  580214/s      55%       --     -49%     -49%     -51%
    use_href 1129422/s     202%      95%       --      -1%      -5%
    use_post 1140634/s     205%      97%       1%       --      -4%
    use_hash 1184835/s     217%     104%       5%       4%       --
    

    On a server with v5.36 the slice-based versions are around 160% to 170% faster than pairwise (with mesh again a bit faster than pairwise, similarly to above)

    Of the others, on the laptop the hash-based one is always a few percent quicker, while on a server with v5.36 they are all very close. Easy to call it a tie.
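
    The benchmark script itself is not shown, so the following is only a guess at how the three slice-based variants might be compared with the core Benchmark module. The sub names mirror the labels in the table above; mesh and pairwise are left out to keep the sketch free of non-core modules, and Perl 5.24 or later is assumed for the postfix variant.

```perl
use v5.24;
use warnings;
use Benchmark qw(cmpthese);

my $api_token = { array => [ 'user_id', 'session_id', 'expiry' ], max => 3, name => 'session' };
my $token_got = [ '9923232345812112323', '1111323232000000465', '1002323001752323232' ];

# Plain hash filled through a hash slice, returned by reference.
sub use_hash {
    my %rt;
    @rt{ @{ $api_token->{array} } } = @$token_got;
    return \%rt;
}

# Slice through an autovivified hash reference, circumfix syntax.
sub use_href {
    my $rt;
    @{ $rt }{ @{ $api_token->{array} } } = @$token_got;
    return $rt;
}

# Same slice with postfix dereferencing.
sub use_post {
    my $rt;
    $rt->@{ $api_token->{array}->@* } = @$token_got;
    return $rt;
}

# A negative count means: run each candidate for at least that many CPU seconds.
cmpthese( -1, {
    use_hash => \&use_hash,
    use_href => \&use_href,
    use_post => \&use_post,
} );
```

    Timings from a run like this depend heavily on the machine and the Perl version, as the two sets of numbers above already suggest.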


    The following is an edit by the OP, who timed a 61% speedup (see comments)

    CHANGED CODE:

    @rt{ @{ $api_token->{array} } } = @$token_got; ### much faster one-liner replaced the loop; credit to @zdim