I am trying to read 8-bit integer values from a file into a numpy array and convert them into 32-bit floats. I need to write these 32-bit floats to a particular memory buffer (I am using shared memory for multiprocessing), so I am trying to figure out whether there is a way to do the type conversion straight into my pre-allocated float32 buffer without creating a temporary array.
For instance, if there were an astype method with an output argument, like:
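import numpy as np

num_samples = 100
raw_buffer = bytearray(num_samples)
with open(filepath, 'rb') as f:
    f.readinto(raw_buffer)  # read the raw int8 samples
int_array = np.frombuffer(raw_buffer, dtype=np.int8)
float_buffer = bytearray(num_samples * 4)  # 4 bytes per float32
# Hypothetical: astype() has no `out` argument, this is what I would like to write
int_array.astype(np.float32, out=float_buffer)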
Is there a way to do something like this?
I am not sure this is the form you expected, but I am fairly sure it gives the result you want: create a float32 view on the destination buffer and assign into it.
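import struct
import numpy as np

a = np.arange(14, 17, dtype=np.int8)      # source int8 values: 14, 15, 16
buf = bytearray(12)                        # destination: room for 3 float32
b = np.frombuffer(buf, dtype=np.float32)   # writable float32 view on buf
b[:] = a                                   # converts and writes into buf
# Note that it does write into the buffer
buf
# bytearray(b'\x00\x00`A\x00\x00pA\x00\x00\x80A')
struct.unpack('fff', buf)
# (14.0, 15.0, 16.0)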
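The slice assignment above can also be spelled with np.copyto, which makes the casting rule explicit. This is only a sketch using the variable names from the question; the file read is replaced by made-up data (an assumption, since I don't have your file), and a plain bytearray stands in for your shared-memory buffer.

```python
import numpy as np

num_samples = 100

# Stand-in for the bytes read from the file (assumption: int8 samples 0..99).
raw_buffer = bytearray(np.arange(num_samples, dtype=np.int8).tobytes())
int_array = np.frombuffer(raw_buffer, dtype=np.int8)

# Pre-allocated destination: 4 bytes per float32.
float_buffer = bytearray(num_samples * 4)
# Writable float32 view on the destination; no data is copied here.
float_view = np.frombuffer(float_buffer, dtype=np.float32)

# Convert int8 -> float32 and write the result into float_view's memory.
# casting="safe" refuses conversions that could lose information.
np.copyto(float_view, int_array, casting="safe")

print(float_view[:5])
```

Since float_view is just a view, float_buffer itself holds the converted floats afterwards. I would expect the same to work for any writable object exposing the buffer protocol, but I have only tried it with a bytearray.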
And from a timing perspective, on my PC, just running a.astype(np.float32) on a 10000-byte array takes 4.6 μs, and running b[:]=a takes 4.1 μs. So if an unnecessary copy is your concern, this solution apparently isn't doing more work than astype.
Note that replacing a.astype(np.float32) with a.astype(np.float32).copy() (just to estimate how much an extra copy costs) brings the timing to 8.2 μs. So a copy apparently costs about 4 μs. I therefore don't think there is any unnecessary copy in my b[:]=a, or else it wouldn't cost only 4.1 μs. (There is the implicit copy/conversion from the int8 buffer to the float32 buffer, but that one can't be avoided, and you didn't expect it to be avoided.) So I am quite confident that nothing more is happening here than what your hypothetical astype(np.float32, out=buffer) would do.
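If you want to repeat the comparison on your own machine, something like the following should do. It is a rough sketch with timeit; the array contents and the repetition count are arbitrary choices of mine, and the absolute numbers will differ from the ones I quoted.

```python
import timeit
import numpy as np

# 10000 int8 values (the contents are arbitrary, only the size matters here).
a = (np.arange(10000) % 128).astype(np.int8)
buf = bytearray(a.size * 4)
b = np.frombuffer(buf, dtype=np.float32)  # float32 view on the destination


def assign_into_view():
    b[:] = a  # convert and write straight into buf


n = 2000  # repetitions per measurement
t_astype = timeit.timeit(lambda: a.astype(np.float32), number=n) / n
t_assign = timeit.timeit(assign_into_view, number=n) / n
t_astype_copy = timeit.timeit(lambda: a.astype(np.float32).copy(), number=n) / n

print(f"a.astype(np.float32):        {t_astype * 1e6:.2f} us")
print(f"b[:] = a:                    {t_assign * 1e6:.2f} us")
print(f"a.astype(np.float32).copy(): {t_astype_copy * 1e6:.2f} us")
```

If b[:]=a were hiding a temporary array, I would expect its timing to look more like the astype-plus-copy line than like plain astype.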