ruby binary-data ruby-1.9.3 pack unpack

How do pack() and unpack() work in Ruby?


Why do we need array packing in Ruby, and how do directives help with it?

I ran some code in my console to see what the directives do when packing an array, but the output is nearly the same for each directive. How do they actually differ?

irb(main):003:0> n = [ 65, 66, 67 ]
=> [65, 66, 67]
irb(main):004:0> n.pack("ccc")
=> "ABC"
irb(main):005:0> n.pack("C")
=> "A"
irb(main):006:0> n.pack("CCC")
=> "ABC"
irb(main):007:0> n.pack("qqq")
=> "A\x00\x00\x00\x00\x00\x00\x00B\x00\x00\x00\x00\x00\x00\x00C\x00\x00\x00\x00\x00\x00\x00"
irb(main):008:0> n.pack("QQQ")
=> "A\x00\x00\x00\x00\x00\x00\x00B\x00\x00\x00\x00\x00\x00\x00C\x00\x00\x00\x00\x00\x00\x00"
irb(main):009:0> n.pack("SSS")
=> "A\x00B\x00C\x00"
irb(main):010:0> n.pack("sss")
=> "A\x00B\x00C\x00"
irb(main):011:0>

From the console I can see that n.pack("SSS") and n.pack("sss"), n.pack("ccc") and n.pack("CCC"), and n.pack("qqq") and n.pack("QQQ") each give the same output. So where are the differences?

The docs also do not give a single example of how each directive is used in a real-life program. I am also confused by the directives below, as I don't know how to test them. Any small code using them would also be helpful:


Solution

  • Your question is really about the fundamental principles of how computers store numbers in memory. These articles are a good place to learn more:

    http://en.wikipedia.org/wiki/Computer_number_format#Binary_Number_Representation
    http://en.wikipedia.org/wiki/Signed_number_representations

    For example, take the difference between S and s: both pack and unpack 16-bit integers, but S is for unsigned integers and s is for signed ones. This matters when you unpack the string back into the original integers.

    S: 16-bit unsigned integer, range 0 to 65535 (0 to 2^16 - 1)
    s: 16-bit signed integer, range -32768 to 32767 (-(2^15) to 2^15 - 1) (one bit used for the sign)

    The difference can be seen here:

    # S = unsigned: values outside 0..65535 wrap around (-1 comes back as 65535)
    > [-1, 65535, 32767, 32768].pack('SSSS').unpack('SSSS')
    => [65535, 65535, 32767, 32768]

    # s = signed: values outside -32768..32767 wrap around (65535 comes back as -1)
    > [-1, 65535, 32767, 32768].pack('ssss').unpack('ssss')
    => [-1, -1, 32767, -32768]
    

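    So to understand the answer to your question, you have to know how numbers are represented in computer memory. Signed numbers use one bit to represent the sign, which halves the largest positive value they can hold. Unsigned numbers do not need this extra bit, but they cannot represent negative numbers. For values that fit in both ranges (such as 65, 66 and 67) the packed bytes are identical, which is why your outputs look the same.

    These are the very basics of how numbers are represented in binary in computer memory.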
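
To connect this back to the directives in the question, here is a small sketch (the sample values 200 and 2**64 - 1 are chosen only for illustration). It suggests that the signed and unsigned variants produce the same bytes for small values and differ only in how those bytes are read back:

```ruby
# Packing: for values that fit both ranges, c and C emit identical bytes.
signed_bytes   = [65, 66, 67].pack('c3')
unsigned_bytes = [65, 66, 67].pack('C3')
same_bytes     = (signed_bytes == unsigned_bytes)   # both are "ABC"

# Unpacking: the same byte is interpreted differently per directive.
raw         = [200].pack('C')     # a single byte, 0xC8
as_unsigned = raw.unpack('C')     # => [200]
as_signed   = raw.unpack('c')     # => [-56], i.e. 200 - 256

# The same holds for the 64-bit pair q (signed) and Q (unsigned).
all_ones   = [2**64 - 1].pack('Q')
q_unsigned = all_ones.unpack('Q') # => [18446744073709551615]
q_signed   = all_ones.unpack('q') # => [-1]
```

In other words, the directive matters most on the unpack side, where it decides whether the top bit is read as a sign or as part of the value.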

    One reason you need packing is to send numbers as a byte stream from one computer to another (for example, over a network connection). You have to pack your integers into bytes before they can be sent over a stream. The alternative is to send the numbers as strings, encoding and decoding them as text on both ends instead of packing and unpacking.
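
As a rough illustration of the byte-stream case, the sketch below packs a made-up message header. The field names and values are hypothetical and no real protocol is implied. It uses the n and N directives (16-bit and 32-bit unsigned, big-endian "network" byte order) and then shows the string alternative for comparison:

```ruby
# Hypothetical header: a 16-bit message type followed by a 32-bit payload length.
message_type   = 7
payload_length = 1024

# Sender: turn the two integers into a fixed-size, 6-byte binary string.
wire       = [message_type, payload_length].pack('nN')
wire_bytes = wire.bytes.to_a      # => [0, 7, 0, 0, 4, 0]

# Receiver: the same format string recovers the integers.
msg_type, length = wire.unpack('nN')   # => [7, 1024]

# String alternative: readable, but variable in size and it needs a delimiter.
text    = [message_type, payload_length].join(',')     # "7,1024"
decoded = text.split(',').map { |part| Integer(part) } # => [7, 1024]
```

The packed form always occupies six bytes here, whatever the values are, which is what makes fixed binary headers easy to parse on the receiving end.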

    Or let's say you need to call a C function in a system library from Ruby. System libraries written in C operate on basic integer types (int, unsigned int, long, short, etc.) and C structures (struct). You will need to convert your Ruby integers into system integers or C structures before calling such functions. In those cases pack and unpack can be used to interface with them.
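
A minimal sketch of the struct case follows. The struct layout is invented for illustration, and the actual foreign-function call (for example through a library such as Fiddle) is deliberately left out. The idea is only that a format string can mirror the fields of a C structure:

```ruby
# Hypothetical C layout, assumed only for this sketch:
#   struct sample { int16_t id; uint16_t flags; int32_t value; };
# s = 16-bit signed, S = 16-bit unsigned, l = 32-bit signed, all native-endian.
fields = [-2, 65_535, -100_000]

# Build the raw bytes a C function would expect for this struct.
buffer      = fields.pack('sSl')
struct_size = buffer.bytesize        # 2 + 2 + 4 bytes

# After the C side fills or modifies the buffer, unpack reads the fields back.
round_trip = buffer.unpack('sSl')    # => [-2, 65535, -100000]
```

Real structures may contain compiler-inserted padding between fields, so the format string has to account for that (for instance with the x directive); this layout was picked so that no padding is needed.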


    The additional directives deal with endianness, that is, the order in which the bytes of each packed number are written. See here for what endianness means and how it works:

    http://en.wikipedia.org/wiki/Endianness

    In simplified terms, it tells the packing method in which order the bytes of each integer should be written:

    # Big endian
    > [34567].pack('S>').bytes.map(&:to_i)
    => [135, 7]   
    # 34567 = (135 * 2^8) + 7
    
    # Little endian
    > [34567].pack('S<').bytes.map(&:to_i)
    => [7, 135]   
    # 34567 = 7 + (135 * 2^8)
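
The same modifiers apply to the wider directives as well. The sketch below (the value 0x12345678 is arbitrary) packs a 32-bit integer in both byte orders, shows what happens when the reader assumes the wrong order, and checks that S> and S< match the older n and v directives:

```ruby
value = 0x12345678

big    = [value].pack('L>')       # 32-bit unsigned, big-endian
little = [value].pack('L<')       # 32-bit unsigned, little-endian

big_bytes    = big.bytes.to_a     # => [18, 52, 86, 120]  (0x12 0x34 0x56 0x78)
little_bytes = little.bytes.to_a  # => [120, 86, 52, 18]

# Reading big-endian bytes as little-endian silently yields a different number.
misread = big.unpack('L<').first  # => 2018915346 (0x78563412)

# n and v are fixed-endian shorthands for 16-bit unsigned values.
same_as_n = [34567].pack('S>') == [34567].pack('n')
same_as_v = [34567].pack('S<') == [34567].pack('v')
```

This is why both ends of a connection, or both sides of a file format, have to agree on the byte order explicitly rather than relying on the native order of whichever machine runs the code.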