perl integer radix string-parsing parseint

Parse string to integer in arbitrary base/radix


In languages other than Perl, parsing a string that represents a number in a non-standard base (not just the usual binary (2), octal (8), decimal (10), or hexadecimal (16)), or in a base that must be determined programmatically at runtime, is pretty simple as long as the base is between 2 and 36 (bases above 10 use a-z and/or A-Z for the larger digits). For example, to parse "JQ" in base 34 and get 672 as the result, I can do:

int("JQ", 34)  # Python
parseInt("JQ", 34);  // JavaScript
strtol("JQ", NULL, 34);     // C or C++, also strtoul, strtoll, strtoull
std::stol("JQ", NULL, 34);  // C++-only, best when first argument already a std::string, similar options as strto*
$((34#JQ))  # Even bash can do it with built-in functionality

Is there a built-in way to do this string-to-integer parsing in Perl, with an arbitrary base (ideally one that can be supplied at runtime), without relying on third-party modules? I'm not looking for anything that involves manually looping over the digits and multiplying them into a final value; that's long, tedious, error-prone, and reinventing the wheel.

If the answer is "No", I figure I can cheat by having bash do the work for me with backticks or the like, but it feels kind of weird that Perl, alone among common languages, including languages with similar designs and applications (e.g. Ruby, PHP, bash), would be missing this functionality.


Solution

  • There is no reason to execute a shell just to parse a number in an arbitrary base. The standard (not third-party) POSIX module provides the strtol() function, which, like the C function it wraps, lets you specify a base between 2 and 36:

    #!/usr/bin/env perl
    use strict;
    use warnings;
    use feature qw/say/;
    use POSIX;
    
    say scalar POSIX::strtol("JQ", 34);
    

    The one thing to remember is that in list context it returns a two-element list (the converted number and the number of trailing characters that were not parsed), while in scalar context it returns just the converted number.
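
    The list-context return value is also a convenient way to reject malformed input. Below is a minimal sketch, assuming you want the whole string to be a valid number in the given base; parse_int is a hypothetical helper name chosen for illustration, not something POSIX provides:

```perl
#!/usr/bin/env perl
use strict;
use warnings;
use POSIX ();

# parse_int is a hypothetical helper, not part of the POSIX module.
# It returns the parsed integer, or undef when $str is empty or
# contains characters that are not valid digits in $base (2..36).
sub parse_int {
    my ($str, $base) = @_;

    # strtol reports (0, 0) for an empty string, so reject it up front.
    return undef unless defined $str && length $str;

    # List context: (converted number, count of unparsed trailing characters).
    my ($num, $unparsed) = POSIX::strtol($str, $base);

    return $unparsed == 0 ? $num : undef;
}

my $good = parse_int("JQ", 34);
print "$good\n";                                      # 672

my $bad = parse_int("JQ!", 34);
print defined $bad ? "accepted\n" : "rejected\n";     # rejected
```

    Like the C function, strtol() still accepts leading whitespace and an optional sign, so this check only catches trailing garbage and empty input; stricter validation would need an extra test of your own.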