Tags: python, opencv, google-maps-markers, orientation, aruco

Fundamental understanding of tvecs rvecs in OpenCV-ArUco


I want to use ArUco to find the "space coordinates" of a marker. I have problems understanding the tvecs and rvecs. I have got as far as knowing that the tvecs are the translation and the rvecs are the rotation. But how are they oriented, in which order are they written in the code, and how do I orient them?

Here is a little sketch of the setup.

I have a camera (a laptop webcam, drawn just to illustrate the orientation of the camera) at the position X, Y, Z. The camera's orientation can be described by angle a around X, angle b around Y, and angle c around Z (angles in radians).

So if my camera is stationary, I would take different pictures of the ChArUco boards and give the camera calibration algorithm the tvecs_camerapos (Z, Y, X) and the rvecs_camerapos (c, b, a). I get the cameraMatrix, distCoeffs and tvecs_cameracalib, rvecs_cameracalib. t/rvecs_camerapos and t/rvecs_cameracalib are different, which I find weird.
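For reference, my calibration step looks roughly like this (only a sketch: the board geometry, dictionary and image list are placeholders, it assumes an opencv-contrib-python version that still has `CharucoBoard_create` and `calibrateCameraCharuco`, and I am not sure where my measured camera pose, t/rvecs_camerapos, is supposed to enter):

```python
import cv2

dictionary = cv2.aruco.getPredefinedDictionary(cv2.aruco.DICT_6X6_250)
# Placeholder board: 5x7 squares, 4 cm squares, 3 cm markers.
board = cv2.aruco.CharucoBoard_create(5, 7, 0.04, 0.03, dictionary)

all_corners, all_ids = [], []
image_size = None
for fname in ["calib_01.png", "calib_02.png"]:  # placeholder image list
    img = cv2.imread(fname, cv2.IMREAD_GRAYSCALE)
    image_size = img.shape[::-1]
    corners, ids, _ = cv2.aruco.detectMarkers(img, dictionary)
    if ids is not None:
        # Refine the detected markers to ChArUco corners.
        n, ch_corners, ch_ids = cv2.aruco.interpolateCornersCharuco(
            corners, ids, img, board)
        if n > 3:
            all_corners.append(ch_corners)
            all_ids.append(ch_ids)

# rvecs/tvecs here are what I call t/rvecs_cameracalib:
# one rvec/tvec pair per calibration image.
ret, cameraMatrix, distCoeffs, rvecs, tvecs = cv2.aruco.calibrateCameraCharuco(
    all_corners, all_ids, board, image_size, None, None)
```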

I think t/rvecs_cameracalib are negligible because I am only interested in the intrinsic parameters from the camera calibration algorithm.

Now I want to find the X, Y, Z position of the marker. I use aruco.estimatePoseSingleMarkers with t/rvecs_camerapos and retrieve t/rvecs_markerpos. The tvecs_markerpos don't match my expected values.
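Roughly, this is the call I make (a sketch; the marker length and image are placeholders, cameraMatrix/distCoeffs/dictionary come from the calibration sketch above, and I am not sure where t/rvecs_camerapos fit into this call):

```python
import cv2

# cameraMatrix, distCoeffs and dictionary come from the calibration sketch above.
frame = cv2.imread("marker_photo.png")  # placeholder image of the scene
corners, ids, _ = cv2.aruco.detectMarkers(frame, dictionary)

# 0.05 = marker side length in metres (placeholder); one rvec/tvec per marker.
rvecs_markerpos, tvecs_markerpos, _ = cv2.aruco.estimatePoseSingleMarkers(
    corners, 0.05, cameraMatrix, distCoeffs)
print(tvecs_markerpos)  # these don't match what I expect
```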


Solution

  • OpenCV routines that deal with cameras and camera calibration (including ArUco) use a pinhole camera model. The world origin is defined as the centre of projection of the camera model (where all light rays entering the camera converge), the Z axis is defined as the optical axis of the camera model, and the X and Y axes form an orthogonal system with Z. +Z is in front of the camera, +X is to the right, and +Y is down. All ArUco coordinates are defined in this coordinate system. That explains why your "camera" tvecs and rvecs change: they do not define your camera's position in some world coordinate system, but rather the markers' positions relative to your camera. A short code sketch at the end of this answer shows how to interpret a returned rvec/tvec pair.

    You don't really need to know how the camera calibration algorithm works, other than that it will give you a camera matrix and some lens distortion parameters, which you use as input to other ArUco and OpenCV routines.

    Once you have calibration data, you can use ArUco to identify markers and return their positions and orientations in the 3D coordinate system defined by your camera, with correct compensation for the distortion of your camera lens. This is sufficient to do, for example, augmented reality using OpenGL on top of the video feed from your camera.
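    As a rough illustration of that convention (a sketch, not a drop-in solution: the intrinsics, dictionary, marker length and image are placeholders, it assumes at least one marker is in view, and it assumes an opencv-contrib-python version where cv2.aruco.estimatePoseSingleMarkers still exists):

```python
import cv2
import numpy as np

# Placeholder intrinsics -- use the cameraMatrix/distCoeffs from your own calibration.
cameraMatrix = np.array([[1000.0, 0.0, 640.0],
                         [0.0, 1000.0, 360.0],
                         [0.0, 0.0, 1.0]])
distCoeffs = np.zeros(5)
markerLength = 0.05  # marker side length, in the same unit you want tvec in (here metres)

dictionary = cv2.aruco.getPredefinedDictionary(cv2.aruco.DICT_6X6_250)
frame = cv2.imread("frame.png")  # placeholder image

corners, ids, _ = cv2.aruco.detectMarkers(frame, dictionary)
rvecs, tvecs, _ = cv2.aruco.estimatePoseSingleMarkers(
    corners, markerLength, cameraMatrix, distCoeffs)

rvec, tvec = rvecs[0].ravel(), tvecs[0].ravel()

# tvec is the marker centre expressed in the camera frame:
# +X right, +Y down, +Z out in front of the camera.
print("marker position in camera frame:", tvec)

# rvec is an axis-angle rotation; Rodrigues turns it into a 3x3 matrix
# that rotates points from the marker frame into the camera frame.
R, _ = cv2.Rodrigues(rvec)

# Inverting the transform gives the camera position expressed in the
# marker's own coordinate system instead.
cam_in_marker = -R.T @ tvec
print("camera position in marker frame:", cam_in_marker)
```

    If you want your markers (or your camera) in some fixed world frame of your own, you have to apply that change of coordinates yourself on top of these camera-frame poses.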