coordinates, projection-matrix, lidar, kitti

Projection of a 3D Lidar point into the i-th camera image (KITTI Dataset)


I am working on an object classification problem, and I am using Lidar and camera data from the KITTI dataset. In this paper: http://www.cvlibs.net/publications/Geiger2013IJRR.pdf, they provide the formulas for projecting the 3D point cloud into the i-th camera image plane, but I don't understand some things:

Following equation (3):

If the 3D point X is in Velodyne coordinates and Y is in the i-th camera image, why does X have four coordinates and Y three? Shouldn't it be 3 and 2?

The formula (equation (3) in the paper):

    Y = P_rect^(i) * R_rect^(0) * T_velo^cam * X

I need to project the 3D point cloud onto the camera image plane so that I can create lidar images and use them as a channel for a CNN. Does anyone have ideas on how to do this?

Thank you in advance


Solution

  • For your first question, regarding the dimensions of X and Y: both points are written in homogeneous coordinates, and there are two reasons for that.

    Reason 1. Appending a fourth coordinate to the 3D point, X = (x, y, z, 1)^T, lets a rigid-body motion (rotation plus translation) be applied as a single matrix multiplication, so the whole projection chain collapses into one matrix product.

    Reason 2. The result Y = (u*s, v*s, s)^T is the homogeneous representation of a 2D pixel: dividing by the third component s (the depth of the point along the camera axis) recovers the actual pixel coordinates (u, v). So Y still describes a 2D point, just carrying an extra scale coordinate.
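    A minimal numpy sketch of both ideas (all values here are illustrative, not taken from the dataset):

        import numpy as np

        # Reason 1: homogeneous 3D point -> rigid motion as one matrix product
        X = np.array([5.0, 1.0, 2.0, 1.0])        # (x, y, z, 1) in the Velodyne frame
        R = np.eye(3)                             # illustrative rotation
        t = np.array([[0.1], [0.0], [-0.05]])     # illustrative translation
        T = np.hstack([R, t])                     # 3x4 transformation matrix [R | t]
        X_cam = T @ X                             # computes R*X + t in one step

        # Reason 2: homogeneous pixel -> divide by the scale to get (u, v)
        Y = np.array([640.0, 240.0, 2.0])         # (u*s, v*s, s) with s = 2.0
        u, v = Y[:2] / Y[2]                       # pixel coordinates (320.0, 120.0)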

    Now coming to your second question about LiDAR-image fusion: it requires the extrinsic parameters (the relative rotation and translation between the LiDAR and the camera) and the camera's intrinsic matrix. The rotation and translation together form a 3x4 matrix called the transformation matrix, so the point projection equation becomes

        [u*s  v*s  s]^T = Camera Matrix * Transformation Matrix * [X Y Z 1]^T

    where dividing by the scale s recovers the pixel coordinates (u, v).
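    For KITTI specifically, the intrinsics and the rectifying rotation are provided as P_rect^(i) and R_rect^(0) in the calibration files. Here is a minimal numpy sketch of the full Velodyne-to-image projection; it assumes you have already parsed the calibration file into the arrays P2, R0_rect and Tr_velo_to_cam (the key names used in the KITTI object-detection calib files), and the function name is my own:

        import numpy as np

        def project_velo_to_image(pts_velo, P, R0, Tr_velo_to_cam):
            # pts_velo       : N x 3 LiDAR points in the Velodyne frame
            # P              : 3 x 4 projection matrix of the target camera (e.g. P2)
            # R0             : 3 x 3 rectifying rotation (R0_rect)
            # Tr_velo_to_cam : 3 x 4 Velodyne-to-camera transformation [R | t]
            n = pts_velo.shape[0]
            pts_h = np.hstack([pts_velo, np.ones((n, 1))])   # homogeneous, N x 4

            Tr = np.vstack([Tr_velo_to_cam, [0, 0, 0, 1]])   # pad to 4 x 4
            R = np.eye(4)
            R[:3, :3] = R0                                   # pad to 4 x 4

            # Y = P_rect * R_rect * T_velo_to_cam * X, as in the paper
            y = (P @ R @ Tr @ pts_h.T).T                     # N x 3, homogeneous
            depth = y[:, 2]
            uv = y[:, :2] / depth[:, None]                   # divide by the scale
            return uv, depth

        # Usage: keep only points in front of the camera and inside the image
        # (W and H are the image width and height, e.g. 1242 x 375 for KITTI):
        # uv, depth = project_velo_to_image(scan[:, :3], P2, R0_rect, Tr_velo_to_cam)
        # mask = (depth > 0) & (uv[:, 0] >= 0) & (uv[:, 0] < W) \
        #        & (uv[:, 1] >= 0) & (uv[:, 1] < H)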
    

    You can also refer to: Lidar Image Fusion KITTI

    Once your LiDAR-image fusion is done, you can input this image to your CNN model, for example as sketched below. I am not aware of ready-made DNN modules for LiDAR-fused images.
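    One common way to build the LiDAR channel itself is to rasterize the projected points into a sparse depth image and stack it onto the RGB channels. A minimal sketch under that assumption (375 x 1242 is the usual KITTI image resolution; the function name is illustrative):

        import numpy as np

        def lidar_depth_channel(uv, depth, height=375, width=1242):
            # Rasterize projected points into a sparse H x W depth image;
            # pixels with no LiDAR return stay 0.
            img = np.zeros((height, width), dtype=np.float32)
            cols = np.round(uv[:, 0]).astype(int)
            rows = np.round(uv[:, 1]).astype(int)
            ok = (rows >= 0) & (rows < height) & \
                 (cols >= 0) & (cols < width) & (depth > 0)
            img[rows[ok], cols[ok]] = depth[ok]   # last write wins per pixel
            return img

        # Stack with the RGB image to get a 4-channel CNN input:
        # x = np.dstack([rgb_image, lidar_depth_channel(uv, depth)])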

    Hope this helps.