computer-vision, homography, image-stitching, panoramas, camera-intrinsics

Compute Homography Matrix from Intrinsic and Extrinsic Matrices


I have the intrinsic (K) and extrinsic ([R|t]) matrices from camera calibration. How do I compute the homography matrix (H)?

I tried H = K [R|t] with the z-component of the R matrix set to 0, and then adjusted H so that the destination image points lie completely within the frame, but this did not give the desired H. I am actually trying to stitch multiple images using homographies, given the intrinsic and extrinsic matrices. When I do feature matching followed by homography estimation, the result is completely fine, but I need to calculate H from the K and [R|t] matrices.


Solution

  • There seems to be some confusion. If you are using a homography to map images onto each other, then you are assuming the camera motion between them is a pure rotation.

    If this rotation is given, e.g. as a rotation matrix R, then the homography is simply H = K * R * inv(K). If it isn't, you must estimate it from the images. The simplest case is probably pan-tilt motion (think camera on a tripod). For this case you only need one point match between each image pair.
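
A minimal NumPy sketch of the pure-rotation case follows. It assumes that each calibrated R is a world-to-camera rotation, that both views share the same camera center, and that pixels are mapped from a "source" image into a "destination" image. The function names (`relative_rotation`, `rotation_homography`, `apply_homography`) are made up for this illustration and are not part of any library.

```python
import numpy as np

def relative_rotation(R_src, R_dst):
    """Rotation taking source-camera coordinates to destination-camera coordinates.

    Assumption: R_src and R_dst are world-to-camera rotations (X_cam = R @ X_world).
    """
    return R_dst @ R_src.T

def rotation_homography(K_src, R_rel, K_dst=None):
    """H = K_dst * R_rel * inv(K_src), valid only when the camera purely rotates."""
    if K_dst is None:
        K_dst = K_src  # same intrinsics for both images
    return K_dst @ R_rel @ np.linalg.inv(K_src)

def apply_homography(H, pts):
    """Map an (N, 2) array of pixel coordinates through H."""
    pts_h = np.hstack([pts, np.ones((len(pts), 1))])  # to homogeneous coordinates
    mapped = (H @ pts_h.T).T
    return mapped[:, :2] / mapped[:, 2:3]             # back to pixels
```

If the two extrinsics have noticeably different camera centers, this H will not align the images for nearby scene content, which is the limitation described above.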

    EDIT: Refinement of initial Homography Solution.

    You should also look into bundle adjustment - e.g. using the excellent Ceres solver. A good (if a bit dated) introduction to B.A. is https://lear.inrialpes.fr/pubs/2000/TMHF00/Triggs-va99.pdf .

    For image stitching, the basic idea is to introduce for each matched image point pair (you may need tens/hundreds, well spread across the image area) an auxiliary 3D point lying on a plane, i.e. with one coordinate equal to zero. You then jointly optimize the camera parameters (intrinsic and extrinsic) and 3D point locations to minimize the reprojection error of the 3D points into the image points you have matched. Once you have a solution, you have choices to make:

    1. If the scene is far away from the camera or the camera translation between images can be ignored, you can "convert" the camera rotations between images into homographies, and use them to stitch.
    2. If the camera translations cannot be ignored, things quickly become a lot more complicated. If there are significant visible occlusions (regions of the scene visible in one image but not in others), then no pure-stitching method can resolve them in general without the use of a 3D model of the scene. You can sometimes use approximations, for example subdividing the scene into approximately planar patches.
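
For the planar-patch approximation in option 2, one possible sketch is the plane-induced homography, which accounts for translation as long as the patch really is planar. The conventions here are assumptions, not something stated in the question: the relative pose maps source-camera coordinates to destination-camera coordinates as X_dst = R @ X_src + t, and the plane is written in the source-camera frame as n . X_src = d with unit normal n and distance d > 0. Other texts use the opposite sign for d, which flips the sign of the t * n^T term. The function name `plane_homography` is hypothetical.

```python
import numpy as np

def plane_homography(K_src, K_dst, R, t, n, d):
    """Homography induced by the plane n . X_src = d between two views.

    Assumptions (illustrative conventions):
      - X_dst = R @ X_src + t maps source-camera to destination-camera coordinates
      - n is the unit plane normal expressed in the source-camera frame
      - d is the distance from the source camera center to the plane (d > 0)
    """
    t = np.asarray(t, dtype=float).reshape(3, 1)
    n = np.asarray(n, dtype=float).reshape(1, 3)
    # For points on the plane, (n @ X_src) / d == 1, so t can be folded into the matrix.
    return K_dst @ (R + (t @ n) / d) @ np.linalg.inv(K_src)
```

With t = 0 this reduces to H = K * R * inv(K), so the pure-rotation case is a special case of the same formula. Points off the plane are mapped incorrectly, which is why this works per patch only.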