I am not getting any proper source to understand why we need to change rotational vector to rotational matrix [in the context of calculating angle between two ARUco markers]. We are using
rmat = cv2.Rodrigues(rvec)
rmat1 =cv2.Rodrigues(rvec1)
relative_rmat = rmat1@rmat.T
My questions are
I have tried to understand from Wikipedia. But I am getting more confused. It would be helpful if anyone can provide the source of the concept for both the questions
Translation of a rigid body (or tvec
as used in OpenCV conventions) lives in 3D Euclidean space. We call this the 'configuration space'.
Assume there is a rigid body at a point we call pos1
. A 3x1 vector pos1 = [x1, y1, z1]
completely defines its position uniquely. The term 'unique' means that there is no other way to define pos1
without [x1, y1, z1]
, and also if you go to [x1, y1, z1]
, it will always be pos1
.
Any attitude (a rotation of a rigid body around three-dimensional space) can be represented in many different ways. However, attitudes live in a place called 'Special Orthogonal Space or SO(3)'. This is the configuration space for rotations, and elements in this space are what you were referring to as 'rotation matrices'.
All other ways of defining a rotation like Euler angles, rotation vectors (rvec
in OpenCV), or quaternions are 'local parameterizations' of a given rotation matrix. So they have several issue, including not being unique at some rotations. Gimbal Lock wiki page has some nice visualizations.
To avoid such issues, the simplest way is to use rotation matrices. Even though it seems complex, rotation matrices can be so much easier to work with once you get used to it. Given the properties of SO(3)
, the inverse of a rotation matrix is the transpose of that matrix (which is why you get the rmat.T
when you are trying to get the relative rotation).
Let's assume that you have two markers named 'marker' and 'marker1' that corresponds to rvec
and rvec1
respectively. Your rmat
is the rotation of 'marker' with respect to the camera frame, or how a vector in 'marker' can be represented in the camera frame (I know this can be confusing, but this is how this is defined, so stay with me).
Similarly, rmat1
is how a vector in 'marker1' is represented in the camera frame. Also, keep in mind that these matrices are directional, meaning we need to know inverse(rmat1
) if we need to find the how a vector in camera frame is represented in the 'marker1' frame.
Your relative_rmat
is how you represent a vector in 'marker' in 'marker1'. You cannot randomly hop on to different markers, and always need to go through a common place. First you have to transform a vector in 'marker' to camera frame, and then transform that to the 'marker1' frame. We can write that as
relative_rmat = rmat1 @ inverse(rmat)
But as I said above, a special property of a rotation matrix dictates that its inverse is as same as its transpose. So we write it as:
relative_rmat = rmat1 @ rmat.T
The order matter here and you should always start with the first rotation and pre-multiply subsequent rotations. If you want to go the other way, you just need to take the inverse of the relative_rmat
. So with simple matrix math, we can see that it as same as if we follow the rotations as I described above manually.
relative_rmat_inverse
= relative_rmat.T
= (rmat1 @ rmat.T).T
= (rmat.T).T @ rmat1.T
= rmat @ rmat1.T
It is hard to detail everything here, and it can take a while to understand the math behind the rotation matrices. I would recommend this as a good reference, but it can get very technical depending on your background. If you are new to this, start with basics of robotics, coordinate transformations, and then move to SO(3).