I'm working on a HoloLens 2 Research Mode program in Unity. The HoloLens gives us a depth image that holds, for every pixel, the distance from the depth sensor to the object in front of it.
For every pixel, I project the pixel onto the image plane, then back-project it using the distance captured by the depth sensor, which gives me the xyz point in the depth sensor coordinate frame. This point then needs to be transformed into the global coordinate system. To do so, I get the camera pose from Unity with cam_pose = Camera.main.transform
and, separately, I have saved the depth sensor extrinsic matrix.
From these two matrices I build depth_to_world = cam_pose @ inv(extrinsic).
Then, for every depth point xyz, I compute global_xyz = depth_to_world @ xyz
to get the point in the real world. The problem is that the returned point is off by 10-15 cm. What am I doing wrong? (The code is in Python.)
import numpy as np

x = self.us[Depth_i, Depth_j]  # pixel projected onto the image plane (x)
y = self.vs[Depth_i, Depth_j]  # pixel projected onto the image plane (y)
D = distance_img[Depth_i, Depth_j]  # distance_img is the depth image; D is in millimeters
# D is measured along the ray (spherical image plane), so dividing by the
# ray length gives the component along the optical axis
distance = 1000 * float(D) / np.sqrt(x * x + y * y + 1)
depth_to_world = cam_pose @ np.linalg.inv(Constants.camera_extrinsic)
# homogeneous point in the depth sensor frame, as a 4x1 column
X = np.array([x * distance, y * distance, 1.0 * distance, 1]).reshape(4, 1)
point = (depth_to_world @ X)[0:3, 0]  # xyz in the world frame
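The back-projection step described above can be written as a small self-contained function. This is only a sketch of the same formula as in the question, not a fix for the error: the names backproject and to_world are hypothetical, x and y are assumed to be normalized image-plane coordinates (as in self.us and self.vs), and the output is in whatever unit the distance is passed in.

```python
import numpy as np


def backproject(x, y, radial_distance):
    """Return the homogeneous 4x1 point in the depth sensor frame.

    x, y: normalized image-plane coordinates of the pixel (assumption).
    radial_distance: distance from the sensor to the surface along the ray.
    """
    ray = np.array([x, y, 1.0])
    # Scale the ray so that its length equals the measured distance;
    # z is then the component along the optical axis.
    z = radial_distance / np.linalg.norm(ray)
    return np.append(ray * z, 1.0).reshape(4, 1)


def to_world(depth_to_world, point_h):
    """Apply a 4x4 transform to a homogeneous 4x1 point and return xyz."""
    return (depth_to_world @ point_h)[0:3, 0]
```

A quick sanity check is that the norm of the back-projected xyz equals the distance that was passed in, which confirms the division by sqrt(x*x + y*y + 1) is doing what the comment says.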
I got it! Following https://github.com/petergu684/HoloLens2-ResearchMode-Unity, I first passed the Unity world origin to a WinRT plugin, and the transform became depth_to_world = inv(extrinsic) * cam_pose, where cam_pose is given by TryLocateAtTimeStamp. The other point is that the Unity coordinate system is left-handed (surprisingly!), so z has to be multiplied by -1 (z <- -z). My original depth_to_world transformation was close but not correct.
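The z flip can be sketched like this, assuming the poses coming from the plugin are right-handed and that Unity's frame differs only by the sign of z. The names FLIP_Z, flip_point_z and flip_transform_z are hypothetical and not part of the linked repo. Either the final point is flipped, or the whole 4x4 transform is converted once and then applied to points that are already flipped.

```python
import numpy as np

# Mirror across the xy plane: negates z, leaves x, y and w untouched.
FLIP_Z = np.diag([1.0, 1.0, -1.0, 1.0])


def flip_point_z(point_xyz):
    """Convert an xyz point between right- and left-handed frames (z <- -z)."""
    x, y, z = point_xyz
    return np.array([x, y, -z])


def flip_transform_z(transform):
    """Convert a 4x4 transform so it acts on z-flipped points.

    If p' = FLIP_Z @ p, then FLIP_Z @ (T @ p) == (FLIP_Z @ T @ FLIP_Z) @ p'.
    """
    return FLIP_Z @ transform @ FLIP_Z
```

Both routes give the same result, so the choice mostly depends on whether it is more convenient to fix the matrix once or to fix each point after the transform.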