[SOLVED] Convert 32 bytes binary big endian file (LiDAR data) to python list or array

I have a LiDAR data set in 32 bytes binary big endian format, and I need to convert it to a Python list or array and then write it out as a PCD file. I'm currently using the following code, but it only works for 16 bytes.

What modification should I make so that the code works for a 32 byte big endian file? This is a link to the file that I'm working with.

import open3d as o3d
import numpy as np
import os
import sys
import struct

size_float = 4
list_pcd = []
with open ("C:\\Users\\wilso\\python\\datasets\\DOTX182013031901004142612.log", "rb") as f:
    byte = f.read(size_float*4)
    while byte:
        x,y,z,intensity = struct.unpack("ffff", byte)
        list_pcd.append([x, y, z])
        byte = f.read(size_float*4)
np_pcd = np.asarray(list_pcd)
pcd = o3d.geometry.PointCloud()
v3d = o3d.utility.Vector3dVector
pcd.points = v3d(np_pcd)
o3d.io.write_point_cloud("copy_of_fragment.pcd", pcd)

Solution

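Based on a downloaded copy of the file that you linked to, it seems that your code is already set up for the correct data length. (For more details about this, see below.) The problem is that you are not telling it to use big-endian byte order. In struct.unpack, a leading > in the format string selects big-endian; without a prefix, struct uses the machine's native byte order, which is little-endian on most PCs. See "Byte Order, Size, and Alignment" in the struct documentation.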

If you change your "ffff" to ">ffff" in your program, then it will work.

Then instead of getting numbers like:

1.5583606204912748e-38 -112.75440216064453 8.758058715979973e+18
5.859210099898786e-23 7344.03173828125 44007040221184.0
2.734360572280704e+35 2.1044305180549755e+30 6.728572770953178e-05
862.4961547851562 -1167176.125 -9.643602918084717e+20

you will see numbers like:

-22.08251953125 16.360233306884766 -2.3429789543151855
-21.318897247314453 16.111948013305664 -2.3769736289978027
-20.665271759033203 15.926865577697754 -2.4304943084716797
-19.91761016845703 15.659859657287598 -2.442497730255127
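
As a minimal sketch of the same idea without the real file, the snippet below packs two made-up big-endian records and reads them back. The helper name parse_points and the sample values are my own illustration, not part of the original program; the record layout (x, y, z, intensity) is taken from the question. It also skips a trailing partial record, on the assumption that a truncated file should not raise an error.

```python
import struct

# Big-endian record layout assumed from the question: x, y, z, intensity.
RECORD = struct.Struct(">ffff")

def parse_points(data):
    """Hypothetical helper: return [x, y, z] lists from big-endian records."""
    # Ignore a trailing partial record so iter_unpack gets a whole multiple.
    usable = len(data) - len(data) % RECORD.size
    return [[x, y, z] for x, y, z, _intensity in RECORD.iter_unpack(data[:usable])]

# Made-up sample data (values chosen to be exactly representable as float32).
sample = RECORD.pack(-22.5, 16.25, -2.375, 10.0) + RECORD.pack(-21.25, 16.0, -2.5, 12.0)

points = parse_points(sample)
print(points)  # [[-22.5, 16.25, -2.375], [-21.25, 16.0, -2.5]]

# Reading the same bytes as little-endian gives unrelated values.
wrong = struct.unpack("<ffff", sample[:RECORD.size])
print(wrong[0])
```

With the real file, you would pass the result of f.read() to parse_points and hand the list to np.asarray as before.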

I think that some of the confusion comes from reading ffff as meaning 16 bits, as if each f represented a 4-bit hexadecimal digit. It does not mean this. Each f stands for "float", which means a 32-bit (4-byte) floating point number, and there are four such numbers (x,y,z,intensity), so there are four fs and each record is 16 bytes long. For example, if there were three 64-bit (i.e. double-precision) numbers, then this would be ddd. See: list of format characters
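
If you want to check the sizes yourself, struct.calcsize reports how many bytes a format string covers. A quick sketch (the variable names are just for illustration):

```python
import struct

one_float = struct.calcsize(">f")         # one 32-bit float
one_record = struct.calcsize(">ffff")     # x, y, z, intensity
three_doubles = struct.calcsize(">ddd")   # three 64-bit doubles

print(one_float)      # 4
print(one_record)     # 16
print(three_doubles)  # 24
```

So the f.read(size_float*4) in your program already reads exactly one 16-byte record at a time.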