Tags: python, opencv, computer-vision, camera-calibration, stereo-3d

Epipolar line using essential matrix for calibrated camera is wrong


I want to compute the essential or fundamental matrix for calibrated cameras (R, T, and K are given for each).

There are two cameras. Are the steps below correct for obtaining the essential matrix and an epipolar line? The result I get is completely wrong.

# P1 - 4x4 world-to-camera-1 transformation matrix
# P2 - 4x4 world-to-camera-2 transformation matrix



def skew(x):
    x = x.flatten()
    return np.array([[0, -x[2], x[1]],
                    [x[2], 0, -x[0]],
                    [-x[1], x[0], 0]])

P21 = np.linalg.inv(P2) @ P1



R = P21[:3, :3]
T = P21[:3, 3]

# canonical projection matrices (note: the names P1 and P2 are reused here)
P1 = np.c_[np.eye(3), np.zeros(3)]
P2 = np.c_[R, T]

#essential matrix
E1 = skew(T) @ P2 @ np.linalg.pinv(P1)

pt1 = (x1, y1)

pt1 = undistort((x1, y1))  # undistort the pixel (pseudocode)

pt1 = [(x1 - cx) / fx, (y1 - cy) / fy]  # normalized image coordinates

pt1h = [pt1[0], pt1[1], 1.0]  # homogeneous

epiline1 = E1 @ pt1h

Are the steps above correct if I also want to account for the distortion parameters?
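For reference, the epipolar constraint these steps rely on can be checked numerically on made-up data: with E = [T]ₓ·R built from a known relative pose, matching normalized (distortion-free) points must satisfy x2ᵀ·E·x1 = 0, and E·x1 is the epipolar line in the second view. A minimal sketch, with illustrative pose and point values that are not from the cameras above:

```python
import numpy as np

def skew(x):
    """Cross-product (skew-symmetric) matrix of a 3-vector."""
    x = np.asarray(x).flatten()
    return np.array([[0, -x[2], x[1]],
                     [x[2], 0, -x[0]],
                     [-x[1], x[0], 0]])

# Made-up relative pose of camera 2 w.r.t. camera 1 (illustrative only)
theta = np.deg2rad(10.0)
R = np.array([[np.cos(theta), 0.0, np.sin(theta)],
              [0.0, 1.0, 0.0],
              [-np.sin(theta), 0.0, np.cos(theta)]])
T = np.array([0.5, 0.0, 0.1])

E = skew(T) @ R  # essential matrix for this pose

# Project an arbitrary 3D point into both cameras, in normalized
# (K-removed, distortion-free) coordinates
X = np.array([0.3, -0.2, 4.0])
x1 = X / X[2]                 # camera 1 sits at the world origin
Xc2 = R @ X + T
x2 = Xc2 / Xc2[2]

line2 = E @ x1                # epipolar line of x1 in camera 2
print(abs(x2 @ line2))        # ~0: x2 lies on the epipolar line
```

If this constraint does not hold to machine precision for known-good correspondences, the problem is in the pose or normalization, not in the line equation.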

Here are the details of the cameras:

Camera 1:

K1 = array([[1.13423076e+03, 0.00000000e+00, 1.02850079e+03],
       [0.00000000e+00, 8.25907478e+02, 5.63612899e+02],
       [0.00000000e+00, 0.00000000e+00, 1.00000000e+00]])
dist1 = array([[-0.3921685 ,  0.18873927, -0.00285428, -0.00478303, -0.0478105 ]])
R1: array([[ 0.87748302,  0.47592599, -0.05931269],
       [-0.306075  ,  0.65090299,  0.69472541],
       [ 0.36924469, -0.59145562,  0.71682537]])
T1: array([[-1.86579737],
           [ 0.43997991],
           [ 2.80324197]])

Camera 2:

K2: array([[1.13968506e+03, 0.00000000e+00, 9.85549834e+02],
       [0.00000000e+00, 8.26601887e+02, 5.86985726e+02],
       [0.00000000e+00, 0.00000000e+00, 1.00000000e+00]]), 
dist2: array([[-3.80880220e-01,  1.83123508e-01, -5.59300663e-04,
         1.97688021e-04, -5.11559635e-02]])
R2 : array([[-0.99235374,  0.12149541, -0.0217467 ],
       [-0.10731501, -0.76228259,  0.63828578],
       [ 0.06097166,  0.63573903,  0.76949226]]), 
T2 : array([[0.5810168 ],
       [1.86135225],
       [1.6348977 ]])

Solution

  • Your skew function is simply the cross-product matrix. Like OpenCV does, I first move the world origin into the first camera (so that only the R and T of the second camera relative to the first remain), and then use the following code to calculate the fundamental matrix F:

    # Alternative formula from
    # "Multiple View Geometry in Computer Vision" by Richard Hartley and Andrew Zisserman
    vv = skew(K1.dot(R.T).dot(T))
    F = (np.linalg.inv(K2).T).dot(R).dot(K1.T).dot(vv)
    

    The essential matrix can then be obtained as:

    E = K2.T.dot(F).dot(K1)
    

    The results are:

    F [[ 1.13278823e-01 -1.35253176e+00 -4.88980254e+02]
     [-1.06844252e+00  4.04566728e-01  1.88300039e+02]
     [-5.54697063e+02  1.71231655e+03  6.35764837e+05]]
    
    E [[  146431.66676676 -1273103.54255988 -1293288.66957507]
     [-1001726.06809778   276196.36212759  -564217.70767202]
     [-1213871.5038731    509423.0312259   -488699.08411468]]
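    As a sanity check on these numbers (not part of the original code), the F produced by this formula should be numerically rank 2, and K2ᵀ·F·K1 should agree with the textbook definition E = [T]ₓ·R up to a scale factor (which for this particular formula works out to det(K1)). The relative-pose step below, R = R2·R1ᵀ and T = T2 − R·T1, is my assumption of how the world origin is moved into the first camera:

```python
import numpy as np

def skew(x):
    x = np.asarray(x).flatten()
    return np.array([[0, -x[2], x[1]],
                     [x[2], 0, -x[0]],
                     [-x[1], x[0], 0]])

# Camera data from the question
K1 = np.array([[1.13423076e+03, 0.0, 1.02850079e+03],
               [0.0, 8.25907478e+02, 5.63612899e+02],
               [0.0, 0.0, 1.0]])
K2 = np.array([[1.13968506e+03, 0.0, 9.85549834e+02],
               [0.0, 8.26601887e+02, 5.86985726e+02],
               [0.0, 0.0, 1.0]])
R1 = np.array([[ 0.87748302,  0.47592599, -0.05931269],
               [-0.306075  ,  0.65090299,  0.69472541],
               [ 0.36924469, -0.59145562,  0.71682537]])
T1 = np.array([-1.86579737, 0.43997991, 2.80324197])
R2 = np.array([[-0.99235374,  0.12149541, -0.0217467 ],
               [-0.10731501, -0.76228259,  0.63828578],
               [ 0.06097166,  0.63573903,  0.76949226]])
T2 = np.array([0.5810168, 1.86135225, 1.6348977])

# Assumed relative pose with the world origin in camera 1
R = R2 @ R1.T
T = T2 - R @ T1

# Hartley-Zisserman alternative formula for F
vv = skew(K1 @ R.T @ T)
F = np.linalg.inv(K2).T @ R @ K1.T @ vv
E = K2.T @ F @ K1

# F must be (numerically) rank 2 ...
s = np.linalg.svd(F, compute_uv=False)
print(s[2] / s[0] < 1e-6)                               # True

# ... and E must match [T]x R up to the scale factor det(K1)
print(np.allclose(E, np.linalg.det(K1) * skew(T) @ R))  # True
```
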
    

    I got tired of writing this kind of boilerplate, so I wrote a small library, simplestereo, to manage stereo rigs. The full code I used is the following:

    import numpy as np
    import simplestereo as ss
    
    if __name__ == "__main__":
        
        res1 = (1280, 720) # Camera resolution is required by the initialisation
        res2 = (1280, 720) # but should not affect the F and E results.
        
        # Raw camera 1 parameters (as NumPy arrays)
        K1 = np.array([[1.13423076e+03, 0.00000000e+00, 1.02850079e+03],
               [0.00000000e+00, 8.25907478e+02, 5.63612899e+02],
               [0.00000000e+00, 0.00000000e+00, 1.00000000e+00]])
               
        dist1 = np.array([[-0.3921685 ,  0.18873927, -0.00285428, -0.00478303, -0.0478105 ]])
        
        R1 = np.array([[ 0.87748302,  0.47592599, -0.05931269],
               [-0.306075  ,  0.65090299,  0.69472541],
               [ 0.36924469, -0.59145562,  0.71682537]])
        
        T1 = np.array([[-1.86579737],
                   [ 0.43997991],
                   [ 2.80324197]])
        
        # Camera 2           
        K2 = np.array([[1.13968506e+03, 0.00000000e+00, 9.85549834e+02],
               [0.00000000e+00, 8.26601887e+02, 5.86985726e+02],
               [0.00000000e+00, 0.00000000e+00, 1.00000000e+00]])
                
        dist2 = np.array([[-3.80880220e-01,  1.83123508e-01, -5.59300663e-04,
                 1.97688021e-04, -5.11559635e-02]])
        
        R2 = np.array([[-0.99235374,  0.12149541, -0.0217467 ],
               [-0.10731501, -0.76228259,  0.63828578],
               [ 0.06097166,  0.63573903,  0.76949226]])
                
        T2 = np.array([[0.5810168 ],
               [1.86135225],
               [1.6348977 ]])
    
        
        # As a convention, the world origin must be in the left camera.
        # Move the world origin into the first camera (IMPORTANT)
        R, T = ss.utils.moveExtrinsicOriginToFirstCamera(R1, R2, T1, T2)
        
        # Create the StereoRig
        rig = ss.StereoRig(res1, res2, K1, K2, dist1, dist2, R, T) 
    
        print("F", rig.getFundamentalMatrix())
        print("E", rig.getEssentialMatrix())
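For reference, what I assume `ss.utils.moveExtrinsicOriginToFirstCamera` computes is the standard change of world origin, R = R2·R1ᵀ and T = T2 − R·T1 (this is my reading, not a quote of the library's source). That formula can be verified on random extrinsics, because any world point must land at the same camera-2 coordinates whether mapped directly or via camera 1:

```python
import numpy as np

rng = np.random.default_rng(0)

def random_rotation():
    # Random proper rotation via QR decomposition
    Q, _ = np.linalg.qr(rng.normal(size=(3, 3)))
    if np.linalg.det(Q) < 0:
        Q[:, 0] = -Q[:, 0]
    return Q

# Hypothetical world-to-camera extrinsics for two cameras
R1, R2 = random_rotation(), random_rotation()
T1, T2 = rng.normal(size=3), rng.normal(size=3)

# Presumed equivalent of moveExtrinsicOriginToFirstCamera:
# world origin moved into camera 1, camera 2 expressed relative to it
R = R2 @ R1.T
T = T2 - R @ T1

# A world point must land at the same camera-2 coordinates either way
X = rng.normal(size=3)
direct   = R2 @ X + T2            # world -> camera 2
via_cam1 = R @ (R1 @ X + T1) + T  # world -> camera 1 -> camera 2
print(np.allclose(direct, via_cam1))  # True
```

Getting this origin-moving step wrong (e.g. feeding the raw world-to-camera R2, T2 straight into the essential-matrix formula, as in the question) is exactly what produces wildly wrong epipolar lines.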