struct endianness

How to read mixed-endian data from a file using Python and NumPy?


I have a file where 32-bit float values are stored with standard little-endian byte order, but with the high and low 16-bit words of each value swapped.

I.e. a number whose IEEE-754 bit pattern is 0xDEADBEEF (value approx. -6.26E+18) is stored in the order 0xBEEFDEAD.

I figured out two ways to do this, but both seem unnecessarily complicated to me:

import numpy as np

# Reading the data as 16-bit words seems like a good start:
words_in = np.fromfile("data.bin", dtype="uint16")

# Reverse word order, interpret as float and again reverse order of values:
words_backwards = words_in[-1::-1].copy()
float_values_backwards = words_backwards.view(">f4")
float_values = float_values_backwards[-1::-1]

# Above written in one line:
float_values = words_in[-1::-1].copy().view(">f4")[-1::-1]


# Or using reshaping and transpose operations:
float_values = words_in.reshape(-1, 2).T[-1::-1].T.reshape(-1).view(">f4")

Is there an easier or more intuitive way to swap word order but retain byte order in Python?


Solution

  • Byte-swap the 16-bit array, then view the result as little-endian 32-bit floats:

    import numpy as np
    
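    data = bytes.fromhex('beefdead')  # 0xDEADBEEF with its 16-bit words swapped
    org = np.frombuffer(data, dtype=np.uint16)
    # byteswap() swaps the two bytes inside every 16-bit word and returns a copy
    fix = org.byteswap().view('<f4')
    print(fix)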
    

    Output:

    [-6.2598534e+18]
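
To apply the same idea to a file, as in the question, a minimal sketch might look like this. The helper name `read_word_swapped_floats` and the sample bytes are made up for illustration, and the sketch assumes the byte layout used above (0xDEADBEEF stored as the bytes BE EF DE AD):

```python
import os
import tempfile

import numpy as np


def read_word_swapped_floats(path):
    """Read 32-bit floats whose two 16-bit words are stored low word first.

    Hypothetical helper; assumes each word keeps its bytes in big-endian
    order, i.e. 0xDEADBEEF is stored as the bytes BE EF DE AD.
    """
    words = np.fromfile(path, dtype=np.uint16)
    if words.size % 2:
        raise ValueError("file must contain an even number of 16-bit words")
    # Swapping the two bytes inside every word turns BE EF DE AD into
    # EF BE AD DE, the ordinary little-endian encoding of 0xDEADBEEF.
    return words.byteswap().view("<f4")


# Round-trip check with a temporary file holding two sample values:
# 0xDEADBEEF and 0x3F800000 (1.0), both stored with their words swapped.
sample = bytes.fromhex("beefdead" "00003f80")
with tempfile.NamedTemporaryFile(delete=False, suffix=".bin") as tmp:
    tmp.write(sample)
try:
    values = read_word_swapped_floats(tmp.name)
finally:
    os.remove(tmp.name)
print(values)
```

Because `byteswap()` works on the raw bytes and `'<f4'` names the byte order explicitly, the result should not depend on the byte order of the machine running the code.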