php pack unpack

Implement unpack b from Perl in PHP


Given a string of bytes I want their binary representation such that the bits of each byte are ordered from least to most significant (as produced by Perl's unpack "b*").

For example,

"\x28\x9b"

should return

"0001010011011001"

In this post Ilmari Karonen described how to achieve pack "b*" in PHP. So I thought all I have to do is split the hex string into bytes and run them through base_convert.

function unpack_B($data) {
    $unpacked = unpack("H*", $data)[1];
    $nibbles = str_split($unpacked, 2);
    foreach ($nibbles as $i => $nibble) {
        $nibbles[$i] = base_convert($nibble, 16, 2);
    }
    return implode("", $nibbles);
}

However, it's returning something different.

What am I missing here?


Solution

  • Looking at the docs for Perl's pack(), it seems that B is the usual "big endian" [I know I'm abusing this term], "descending" bit order, and b is "little endian"/"ascending".

    I honestly cannot work out what the code/answer you've linked is supposed to do, so I've written it all from scratch based on what the Perl docs say the pack arguments do.

    function bin_to_litbin($input, $be=true) {
        return implode(
            '',
            array_map(
                function($a)use($be){
                    $ret = str_pad(decbin(ord($a)), 8, '0', STR_PAD_LEFT);
                    if(!$be) {
                        $ret = strrev($ret);
                    }
                    return $ret;
                },
                str_split($input, 1)
            )
        );
    }
    
    function litbin_to_bin($input, $be=true) {
        return implode(
            '',
            array_map(
                function($a)use($be){
                    if(!$be) {
                        $a=strrev($a);
                    }
                    return chr(bindec($a));
                },
                str_split($input, 8)
            )
        );
    }
    
    $hex = '00289b150065b302a06c560094cd0a80';
    $bin = hex2bin($hex);
    
    var_dump(
        $hex,
        $cur =  bin_to_litbin($bin, false),
        bin2hex(litbin_to_bin($cur, false))
    );
    

    where $be=true is B/"big endian" and $be=false is b/"little endian".

    Output:

    string(32) "00289b150065b302a06c560094cd0a80"
    string(128) "00000000000101001101100110101000000000001010011011001101010000000000010100110110011010100000000000101001101100110101000000000001"
    string(32) "00289b150065b302a06c560094cd0a80"
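
    As for what the attempt in the question is missing, two things stand out: base_convert() drops leading zeros, so each byte no longer yields exactly 8 digits, and nothing reverses the bits within each byte. A minimal sketch of a direct fix follows; the function name unpack_b_sketch is made up for illustration, and it assumes the input is a plain byte string.

    ```php
    <?php
    // Hypothetical helper (name invented for this example): emulates the
    // bit ordering of Perl's unpack("b*", $data) for a raw byte string.
    function unpack_b_sketch(string $data): string
    {
        if ($data === '') {
            return ''; // str_split('') returns [''] before PHP 8.2
        }
        $bits = '';
        foreach (str_split($data) as $byte) {
            // decbin() drops leading zeros, so pad each byte to 8 digits...
            $msbFirst = str_pad(decbin(ord($byte)), 8, '0', STR_PAD_LEFT);
            // ...then reverse so the least significant bit comes first.
            $bits .= strrev($msbFirst);
        }
        return $bits;
    }

    var_dump(base_convert('28', 16, 2));   // string(6) "101000" (two leading zeros lost)
    var_dump(unpack_b_sketch("\x28\x9b")); // string(16) "0001010011011001"
    ```

    This is the same pad-then-reverse idea as bin_to_litbin($input, false) above, just written as a plain loop.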
    

    Though truth be told I cannot think of any practical reason to ever encode data as literal zero and one characters. It is wildly unnecessary and wasteful compared to literally any other encoding. I would wager that that is why B and b were never implemented in PHP.

    Base64 is 1.33x the length of its input, hex is 2x, and literal binary is 8x.
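
    Those ratios are easy to check. A small sketch, using an arbitrary 12-byte sample string (a multiple of 3, so Base64 adds no padding):

    ```php
    <?php
    // Arbitrary 12-byte sample input, chosen only for illustration.
    $raw = "Hello, PHP!!";

    // Literal binary: one '0' or '1' character per bit, 8 per byte.
    $bitString = '';
    foreach (str_split($raw) as $byte) {
        $bitString .= str_pad(decbin(ord($byte)), 8, '0', STR_PAD_LEFT);
    }

    printf("raw:    %d bytes\n", strlen($raw));                // 12
    printf("base64: %d bytes\n", strlen(base64_encode($raw))); // 16
    printf("hex:    %d bytes\n", strlen(bin2hex($raw)));       // 24
    printf("bits:   %d bytes\n", strlen($bitString));          // 96
    ```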