I'm trying to read image data from a CBOR file into a NumPy array.
Ideally, I'm looking for a more efficient way to convert the bytes from two's complement to unsigned and then read the image data into a NumPy array.
I experimented with a few different ways to convert and read the bytes, but wasn't able to improve the speed by a significant margin.
Originally I used a for loop to convert the bytes (1. below), then I used NumPy with modulo (2. below), and then I moved to selective addition (3. below).
My full functions are below as well.
1) for x in data:
       new_byte = x%256
2) ndarray%256
3) image[image<0] += 256
import os
from cbor2 import dumps, loads, decoder
import numpy as np
import itertools

def decode_image_bytes(image_byte_array):
    """Input: 1-D list of 16-bit two's complement bytes
    Operations: Converts the bytes to unsigned and decodes them
    Output: a 1-D array of 16-bit image data"""
    # Convert input to numpy array
    image = np.array(image_byte_array)
    # Convert two's complement bytes to unsigned
    image[image<0] += 256
    # Split the unsigned bytes into segments
    bytes_array = np.array_split(image, len(image)//2)
    holder = list()
    # Convert segments into integer values
    for x in bytes_array:
        holder.append(int.from_bytes(list(x), byteorder='big', signed=False))
    return holder

def decode_image_metadata(image_dimensions_bytes_array):
    """Input: 1-D list of sint64 two's complement bytes
    Operations: Converts bytes to unsigned and decodes them
    Output: Dictionary with possible values: 'width, height, channels, Z, time'"""
    # Convert input to numpy array
    dimensions = np.array(image_dimensions_bytes_array)
    # Convert two's complement bytes to unsigned
    dimensions[dimensions<0] += 256
    # Split the unsigned bytes into segments
    bytes_array = np.array_split(dimensions, len(dimensions)//8)
    # Convert the segments into integer values
    for x in range(0, len(bytes_array)):
        bytes_array[x] = int.from_bytes(list(bytes_array[x]), byteorder='big', signed=True)
    # Put the converted integer values into a dictionary
    end = dict(itertools.zip_longest(['width', 'height', 'channels', 'Z', 'time'], bytes_array, fillvalue=None))
    return end

Right now it takes 20-30 seconds to convert the bytes and return the NumPy array. I'd like to cut that in half if possible.
So far I've come up with using np.apply_along_axis to eliminate the for loops. Is there a better method?
bytes_array = np.apply_along_axis(metadata_values, 1, bytes_array)
def metadata_values(element):
    return int.from_bytes(element, byteorder='big', signed=True)
Unless you're doing it for your own edification, you shouldn't write your own conversion between binary number representations: it will be orders of magnitude slower than letting NumPy interpret the raw bytes directly.
Here is an example of reading bytes into a numpy array of various formats:
>>> b = bytes([0,1,127,128,255,254])  # equivalent to reading bytes from a file in binary mode
>>> np.frombuffer(b, dtype=np.uint8)  # notice the *U*int vs int
array([  0,   1, 127, 128, 255, 254], dtype=uint8)
>>> np.frombuffer(b, dtype=np.int8)
array([   0,    1,  127, -128,   -1,   -2], dtype=int8)
>>> # you can also specify formats wider than 1 byte, as long as the number of bytes is a multiple of the item size
>>> # (the results below are for a little-endian machine, since these dtypes use native byte order)
>>> np.frombuffer(b, dtype=np.int16)
array([   256, -32641,   -257], dtype=int16)
>>> np.frombuffer(b, dtype=np.uint16)
array([  256, 32895, 65279], dtype=uint16)
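
Applied to the two functions in the question, the same idea might look like the sketch below. It assumes the data arrives either as raw bytes or as a list of signed byte values in the range -128 to 127, and that the values are big-endian, as the `byteorder='big'` calls in the question suggest. The helper `to_buffer` is a hypothetical name introduced here for illustration, not part of NumPy or cbor2.

```python
import numpy as np


def to_buffer(data):
    """Return a bytes-like object for raw bytes or a list of signed byte values."""
    if isinstance(data, (bytes, bytearray, memoryview)):
        return data
    # Assumption: a list of ints in [-128, 127], as described in the question.
    # Storing them as int8 keeps the bit pattern, so no manual +256 is needed.
    return np.asarray(data, dtype=np.int8).tobytes()


def decode_image_bytes(data):
    """Interpret the buffer as big-endian unsigned 16-bit pixel values."""
    return np.frombuffer(to_buffer(data), dtype=">u2")


def decode_image_metadata(data):
    """Interpret the buffer as big-endian signed 64-bit integers."""
    values = np.frombuffer(to_buffer(data), dtype=">i8")
    names = ["width", "height", "channels", "Z", "time"]
    # Missing trailing values are reported as None, as in the original function.
    return {name: (int(values[i]) if i < len(values) else None)
            for i, name in enumerate(names)}
```

If the pixel layout is row-major (an assumption, since the question does not say), the flat result could then be shaped with something like `pixels.reshape(meta["height"], meta["width"])`. Note that `np.frombuffer` returns a read-only view of the bytes, so call `.copy()` on the result if you need to modify it in place.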