python, opencv, triangulation, stereo-3d

Determining 3D locations from two images using OpenCV - triangulatePoints units


Given a set of corresponding points between two arbitrary (i.e. not parallel) images (e.g. as found by SURF), I have used the `triangulate` function further below in an attempt to extract the 3D positions of the points.
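
For context, here is a hypothetical sketch of how such correspondences might be obtained. ORB stands in for SURF (which lives in opencv-contrib's non-free module), and `match_points` is my own helper name:

    import cv2
    import numpy as np

    def match_points(img1, img2):
        # Detect and describe features in both (grayscale) images.
        orb = cv2.ORB_create(nfeatures=2000)
        kp1, des1 = orb.detectAndCompute(img1, None)
        kp2, des2 = orb.detectAndCompute(img2, None)
        # Brute-force matching with cross-check keeps only mutual best matches.
        matcher = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)
        matches = sorted(matcher.match(des1, des2), key=lambda m: m.distance)
        pts1 = np.float32([kp1[m.queryIdx].pt for m in matches])
        pts2 = np.float32([kp2[m.trainIdx].pt for m in matches])
        return pts1, pts2

The triangulation attempt itself: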

    import cv2
    import numpy as np

    def triangulate(pts1, pts2):
        # Intrinsics assumed to be the identity (see the question below).
        cameraMatrix = np.eye(3)
        F, m1 = cv2.findFundamentalMat(pts1, pts2)  # apparently not necessary

        # The essential matrix gives the rotation/translation between the
        # cameras, although decomposing it yields two possible rotations:
        E, m2 = cv2.findEssentialMat(pts1, pts2, cameraMatrix, cv2.RANSAC, 0.999, 1.0)
        Re1, Re2, t_E = cv2.decomposeEssentialMat(E)

        # recoverPose resolves the ambiguity and returns a single R and t;
        # one of the two rotations above agrees with the R determined here.
        # recoverPose can already triangulate, but I triangulate by hand
        # below to compare the results.
        retval, R, t, mask2, triangulatedPoints = cv2.recoverPose(
            E, pts1, pts2, cameraMatrix, distanceThresh=0.5)

        # Given R and t, find the 3D locations explicitly via the two
        # projection matrices (the left camera is taken as the world origin).
        M_r = np.concatenate((R, t), axis=1)
        M_l = np.concatenate((np.eye(3), np.zeros((3, 1))), axis=1)
        proj_r = np.dot(cameraMatrix, M_r)
        proj_l = np.dot(cameraMatrix, M_l)
        points_4d_hom = cv2.triangulatePoints(proj_l, proj_r,
                                              np.expand_dims(pts1, axis=1),
                                              np.expand_dims(pts2, axis=1))
        # Dehomogenize: divide each column by its w component.
        points_4d = points_4d_hom / np.tile(points_4d_hom[-1, :], (4, 1))
        points_3d = points_4d[:3, :].T
        return points_3d
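
A hypothetical call site (assuming `img1` and `img2` are grayscale input images and `match_points` is the helper sketched above):

    pts1, pts2 = match_points(img1, img2)   # Nx2 float32 pixel coordinates
    points_3d = triangulate(pts1, pts2)
    print(points_3d[:5])  # first few (x, y, z) points in the left camera frame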

I have assumed that my intrinsic camera matrices are approximately the identity in the above. The R and t determined by the two methods (findEssentialMat -> decomposeEssentialMat vs. recoverPose) agree, and the triangulated points determined by the two methods (recoverPose vs. triangulatePoints) also agree.
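
A quick numerical sketch of that check (assuming `Re1`, `Re2`, `t_E`, `R` and `t` have been returned or exposed from `triangulate` for inspection):

    # R from recoverPose should match one of the two rotations from
    # decomposeEssentialMat, and t should match t_E up to sign (both are
    # unit vectors, so the true baseline length is unrecoverable anyway).
    assert np.allclose(R, Re1, atol=1e-6) or np.allclose(R, Re2, atol=1e-6)
    assert np.allclose(np.abs(t), np.abs(t_E), atol=1e-6)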

My question concerns the values I see, which for points_3d are generally in the range 0-50 for x and y and 0-0.03 for z. As far as I know these values should be in pixel units; has my choice of cameraMatrix = I affected the scale?


Solution

  • Yes, your choice of camera matrix directly affects the scale. The camera matrix in OpenCV should contain the values fx and fy, the camera's focal length (principal distance) expressed in pixel units - see the OpenCV camera model.
    If you set both values to 1, you will get "smaller values" in pixel units for your 3D points. Usually (clearly depending on the camera) the values of fx and fy are around, e.g., 1000. A nice way to estimate the focal length of a webcam uses only the resolution and the approximate field of view (FOV); see the sketch below.
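
As a concrete illustration of that estimate (with assumed values: a 640x480 webcam with a ~60° horizontal FOV), the pinhole relation fx = (w / 2) / tan(FOV / 2) gives roughly 554 pixels:

    import numpy as np

    def focal_length_px(image_width_px, horizontal_fov_deg):
        # Half the image width subtends half the horizontal FOV.
        return (image_width_px / 2.0) / np.tan(np.radians(horizontal_fov_deg) / 2.0)

    fx = focal_length_px(640, 60.0)  # ~554 px for the assumed webcam
    K = np.array([[fx,  0, 320],     # principal point assumed at the image center
                  [ 0, fx, 240],
                  [ 0,  0,   1]])

Passing such a K to findEssentialMat/recoverPose (instead of the identity) rescales the triangulated coordinates accordingly.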