I'm working with large file manipulation (over 2 GB) and I have a lot of processing functions to deal with the data. My problem is that the processing is taking a lot (A LOT) of time to finish. Of all the functions, the one that seems to take the longest is this one:
def BinLsb(data):
    Len = len(data)
    databin = [0] * Len
    num_of_bits = 8
    ### convert the octets to binary, LSB first
    for i in range(Len):
        newdatabin = bin(int(data[i], 16))[2:].zfill(num_of_bits)[::-1]
        databin[i] = newdatabin
    ### group into 14-bit chunks, LSB first again
    databin = ''.join(databin)
    composite_list = [databin[x:x + 14] for x in range(0, len(databin), 14)]
    LenComp = len(composite_list)
    for i in range(LenComp):
        composite_list[i] = int(composite_list[i][::-1], 2)
    return composite_list
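For reference, here is a minimal example call (assuming each element of data is one hex octet as a two-character string, which is how my data looks):

sample = ["a5", "3c"]   # two hex octets
print(BinLsb(sample))   # -> [15525, 0]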
I'd really appreciate some performance tips / another approach to this algorithm in order to save me some time. Thanks in advance!
Basic analysis of your function: time complexity: 3·O(n); space complexity: 3·O(n). That's because you loop over the data three times. My suggestion is to loop only once and use a generator, which will cost about 1/3 of the time and space.
I upgraded your code, removed some useless variables, and turned it into a generator:
def binLsb(data):
    databin = ""
    num_of_bits = 8
    for octet in data:
        # hex octet -> 8 bits, LSB first
        databin += bin(int(octet, 16))[2:].zfill(num_of_bits)[::-1]
        # yield each complete 14-bit group as soon as it is available
        while len(databin) >= 14:
            yield int(databin[:14][::-1], 2)
            databin = databin[14:]
    # flush the trailing partial group so no bits are dropped
    if databin:
        yield int(databin[::-1], 2)
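A quick sanity check (using the same hypothetical sample data as above) shows the generator yields the same values as the original list-based version:

print(list(binLsb(["a5", "3c"])))   # -> [15525, 0]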
enjoy
Oliver