python memory bit-packing

How to pack arbitrary bit sequence in Python?


I want to encode/compress some binary image data as a sequence of bits. (This sequence will, in general, have a length that does not fit neatly in a whole number of standard integer types.)

How can I do this without wasting space? (I realize that, unless the sequence of bits has a "nice" length, there will always have to be a small amount [< 1 byte] of leftover space at the very end.)

FWIW, I estimate that, at most, 3 bits will be needed per symbol that I want to encode. Does Python have any built-in tools for this kind of work?


Solution

  • There's nothing very convenient built in, but there are third-party modules such as bitstring and bitarray that are designed for this.

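    from bitstring import BitArray
    s = BitArray('0b11011')   # start from a binary literal
    s += '0b100'              # append three more bits
    s += 'uint:5=9'           # append 9 as a 5-bit unsigned integer
    s += [0, 1, 1, 0, 1]      # append bits from a list
    ...
    s.tobytes()               # bytes, zero-padded to a byte boundary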
    

    To join together a sequence of 3-bit numbers (i.e. values in the range 0 to 7), you could use:

    >>> symbols = [0, 4, 5, 3, 1, 1, 7, 6, 5, 2, 6, 2]
    >>> BitArray().join(BitArray(uint=x, length=3) for x in symbols)
    BitArray('0x12b27eab2')
    >>> _.tobytes()
    b'\x12\xb2~\xab '
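
    If installing a third-party module is not an option, the same packing can be sketched with plain Python integers. This is only a minimal illustration, not something taken from bitstring: the helper names pack_symbols and unpack_symbols are hypothetical, symbols are assumed to be stored most-significant-bit first, and the last byte is assumed to be padded with zero bits. For the twelve symbols above it yields the same five bytes (12 b2 7e ab 20).

```python
def pack_symbols(symbols, width=3):
    """Pack small unsigned ints into bytes, MSB first, zero-padded at the end.

    Hypothetical helper for illustration; `width` is the number of bits per symbol.
    """
    acc = 0
    for s in symbols:
        if not 0 <= s < (1 << width):
            raise ValueError(f"symbol {s} does not fit in {width} bits")
        acc = (acc << width) | s          # append `width` bits on the right
    nbits = width * len(symbols)
    pad = -nbits % 8                      # zero bits needed to reach a byte boundary
    return (acc << pad).to_bytes((nbits + pad) // 8, "big")


def unpack_symbols(data, count, width=3):
    """Inverse of pack_symbols; `count` is needed because the padding is not stored."""
    pad = len(data) * 8 - count * width
    if pad < 0:
        raise ValueError("not enough data for the requested number of symbols")
    acc = int.from_bytes(data, "big") >> pad   # drop the trailing padding bits
    mask = (1 << width) - 1
    return [(acc >> (width * (count - 1 - i))) & mask for i in range(count)]


symbols = [0, 4, 5, 3, 1, 1, 7, 6, 5, 2, 6, 2]
packed = pack_symbols(symbols)
restored = unpack_symbols(packed, len(symbols))
```

    Because the padding bits are indistinguishable from data, the number of symbols has to be stored or known separately. Accumulating everything in one big integer is simple but gets slow for very long sequences, which is where bitstring or bitarray are likely the better choice.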
    

    Some related questions: