computer-vision · homography · image-stitching · 360-panorama · projective-geometry

Compute Homography Matrix based on intrinsic and extrinsic camera parameters


I want to perform 360° panorama stitching for 6 fish-eye cameras.

In order to find the relation among the cameras, I need to compute the homography matrix. The latter is usually computed by detecting features in the images and matching them.

However, for my camera setup I already know:

  • the intrinsic matrix K_i of each camera;
  • the extrinsics of each camera, i.e. its rotation R_i and translation t_i with respect to a common reference frame.
Therefore, I think I could compute the homography matrix directly from these parameters, which I assume would be more accurate than feature matching.

In the literature I found the following formula for the homography matrix which relates image 2 to image 1:

H_2_1 = (K_2) * (R_2)^-1 * R_1 * K_1

This formula only takes into account the rotation between the cameras, not the translation vector that exists in my case.

How could I plug the translation t of each camera in the computation of H?

I have already tried to compute H without considering the translation, but since the baseline d is greater than 1 meter, the images are not accurately aligned in the panorama picture.

EDIT:

Based on Francesco's answer below, I have the following questions:


Solution

  • You cannot "plug" the translation in: its presence along with a nontrivial rotation mathematically implies that the relationship between images is not a homography.

    However, if the imaged scene is and appears "far enough" from the camera, i.e. if the translations between cameras are small compared to the distances of the scene objects from the cameras, and the cameras' focal lengths are small enough, then you may use the homography induced by a pure rotation as an approximation.

    Your equation is wrong. The correct formula is obtained as follows:

    The product H = K2 * R_2_1 * inv(K1) is the homography induced by the pure rotation R_2_1. This rotation transforms points from frame 1 into frame 2; it is represented by a 3x3 matrix whose columns are the components of the x, y, z axes of frame 1 decomposed in frame 2. If your setup gives you the rotations of all the cameras with respect to a common frame 0, i.e. as R_i_0, then R_2_1 = R_2_0 * transpose(R_1_0).

    Generally speaking, you should use the above homography only as an initial estimate, to be refined by matching points and optimizing. This is because (a) the homography model itself is only an approximation (since it ignores the translation), and (b) the rotations given by the mechanical setup (even a calibrated one) are affected by errors. Using matched pixels to optimize the transformation will minimize the errors where it matters: on the image, rather than in an abstract rotation space.