I need to pack information as tightly as possible into a bitstream.
I have variables that each have a different number of distinct states:
Number_of_states=[3,5,129,15,6,2]# A bit longer in reality
The best option I have at the moment is to create a bitfield, using
2+3+8+4+3+1 bits -> 21 bits
However, it should be possible to pack these states into np.log2(3*5*129*15*6*2)=18.4
bits, i.e. 19 whole bits, saving two bits. (In reality I have 298 bits and need to save a few.)
In my case this would save more than 5% of the data stream, which would help a lot.
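The arithmetic above can be checked with a short sketch. The variable names here are illustrative only, and `math.prod` assumes Python 3.8 or newer:

```python
import math

number_of_states = [3, 5, 129, 15, 6, 2]

# Bitfield: every variable gets its own whole number of bits.
# (n - 1).bit_length() is the bits needed to store values 0 .. n-1.
bitfield_bits = sum((n - 1).bit_length() for n in number_of_states)

# Joint encoding: a single integer in range(total_combinations).
total_combinations = math.prod(number_of_states)
joint_bits = (total_combinations - 1).bit_length()

print(bitfield_bits, total_combinations, joint_bits)
```

The gap between the two totals comes from the per-variable rounding up to whole bits, which a joint encoding only pays once.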
Is there a viable solution in Python to pack the data this way? I tried packing (compression) algorithms,
but they create too much overhead with just a few bytes of data. The format string is no problem; it is constant and will be transmitted beforehand.
This is the code I am using at the moment:
from bitstring import pack
import numpy as np
Number_of_states = [3, 5, 129, 15, 6, 2]  # much longer in reality
# One random value per variable, each in range(number of states)
DATA_TO_BE_PACKED = np.random.randint(Number_of_states)
# Build the format string: one unsigned field of ceil(log2(states)) bits per variable
string = ''
for item in Number_of_states:
    string += 'uint:{}, '.format(int(np.ceil(np.log2(item))))
PACKED_DATA = pack(string, *DATA_TO_BE_PACKED)
print(len(PACKED_DATA))
print(PACKED_DATA.unpack(string))
This looks like a use case for a mixed radix numeral system.
A quick proof of concept:
num_states = [3, 5, 129, 15, 6, 2]
input_data = [2, 3, 78, 9, 0, 1]
print("Input data: %s" % input_data)
To encode, start with 0 and, for each variable in order, multiply the running value by that variable's number of states, then add its current state:
encoded = 0
for i in range(len(num_states)):
    encoded *= num_states[i]
    encoded += input_data[i]
print("Encoded: %d" % encoded)
To decode, go through the variables in reverse: the remainder of division by the number of states gives the state, and integer division by the number of states removes it from the encoded value:
decoded_data = []
for n in reversed(num_states):
    v = encoded % n
    encoded = encoded // n
    decoded_data.insert(0, v)
print("Decoded data: %s" % decoded_data)
Example output:
Input data: [2, 3, 78, 9, 0, 1]
Encoded: 316009
Decoded data: [2, 3, 78, 9, 0, 1]
Another example with more values (run under Python 2, hence the L suffix on the decoded long integers):
Input data: [2, 3, 78, 9, 0, 1, 84, 17, 4, 5, 30, 1]
Encoded: 14092575747751
Decoded data: [2L, 3L, 78L, 9L, 0L, 1L, 84L, 17L, 4L, 5L, 30L, 1L]
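For reuse, the same idea could be wrapped in functions and turned into bytes for transmission. This is only a sketch under assumptions: `encode`, `decode` and `packed_size_bytes` are hypothetical helper names, the byte order is chosen arbitrarily as big-endian, and `math.prod` needs Python 3.8 or newer.

```python
import math


def encode(values, num_states):
    """Fold the values into one integer using mixed radix positions."""
    encoded = 0
    for value, n in zip(values, num_states):
        if not 0 <= value < n:
            raise ValueError("value %d is out of range for %d states" % (value, n))
        encoded = encoded * n + value
    return encoded


def decode(encoded, num_states):
    """Recover the values by peeling them off in reverse order."""
    values = []
    for n in reversed(num_states):
        encoded, value = divmod(encoded, n)
        values.append(value)
    values.reverse()
    return values


def packed_size_bytes(num_states):
    """Bytes needed to hold the largest possible encoded value."""
    max_value = math.prod(num_states) - 1
    return max(1, (max_value.bit_length() + 7) // 8)


num_states = [3, 5, 129, 15, 6, 2]
data = [2, 3, 78, 9, 0, 1]

number = encode(data, num_states)
# Assumption: big-endian byte order; sender and receiver must agree on it.
payload = number.to_bytes(packed_size_bytes(num_states), "big")
restored = decode(int.from_bytes(payload, "big"), num_states)

print(number, len(payload), restored)
```

Note that `to_bytes` rounds up to whole bytes (3 bytes here for 19 bits). If the stream really is bit-oriented, the integer could instead be written as a single unsigned field of the required bit length, for example with a bitstring format such as 'uint:19'. Python integers have arbitrary precision, so the same approach should also cover the roughly 298-bit case.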