I have the 3x3 intrinsics and 3x4 extrinsics matrices for my camera, obtained via cv2.calibrateCamera().
Now I want to use these parameters to compute the BEV (Bird's Eye View) transformation for any given coordinates in a frame obtained from the camera.
Which OpenCV function can be used to compute the BEV perspective transformation for given point coordinates, using the camera extrinsics and/or intrinsics matrices?
I found something closely related in the following post: https://deepnote.com/article/social-distancing-detector/ (based on https://www.pyimagesearch.com/2014/08/25/4-point-opencv-getperspective-transform-example/). They use cv2.getPerspectiveTransform() to get a 3x3 matrix, but I don't know whether this matrix represents the intrinsics, the extrinsics, or something else. They then transform the list of points with that matrix as follows:
#Assuming list_downoids is the list of points to be transformed and matrix is the one obtained above
list_points_to_detect = np.float32(list_downoids).reshape(-1, 1, 2)
transformed_points = cv2.perspectiveTransform(list_points_to_detect, matrix)
I really need to know whether I can use cv2.perspectiveTransform to compute the transformation, or whether there is a better way to do this using the extrinsics, the intrinsics, or both, without having to reuse the frame, since I already have the detected/selected coordinates saved in an array.
After a deep investigation, I found a good solution:
The projection matrix is the product of the intrinsic and the extrinsic camera matrices (intrinsics times extrinsics). The extrinsic is a 3x4 matrix and the intrinsic is a 3x3 matrix, but we need the projection matrix to be a 3x3 matrix, so we need to reduce the extrinsic to 3x3 before performing the multiplication.
cv2.getPerspectiveTransform() gives us a 3x3 projection matrix when we don't have the camera parameters: it estimates it from four pairs of corresponding points.
cv2.warpPerspective() transforms the image itself.
For the problem above we don't need these two functions, since we already have the extrinsics, the intrinsics, and the coordinates of the points in the image.
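The reduction of the extrinsic matrix to 3x3 can be justified with the pinhole model, under the assumption that all points of interest lie on the ground plane Z = 0. The symbols here are introduced only for illustration: K is the intrinsic matrix, r_1, r_2, r_3 are the columns of the rotation matrix, t is the translation vector, (X, Y) is a ground point, (u, v) is its pixel, and s is an arbitrary scale factor.

$$
s \begin{bmatrix} u \\ v \\ 1 \end{bmatrix}
= K \begin{bmatrix} r_1 & r_2 & r_3 & t \end{bmatrix}
\begin{bmatrix} X \\ Y \\ 0 \\ 1 \end{bmatrix}
= K \begin{bmatrix} r_1 & r_2 & t \end{bmatrix}
\begin{bmatrix} X \\ Y \\ 1 \end{bmatrix}
$$

Because Z = 0, the column r_3 never contributes, so it can be dropped, leaving a 3x3 matrix.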
Considering the above, I wrote a function that transforms a point into BEV, given the intrinsics and the extrinsics; it can be called for each point in the list list_x_y:
import cv2
import numpy as np

def compute_point_perspective_transformation(intrinsics, extrinsics, point_x_y):
    """Auxiliary function to project a specific point to BEV

    Parameters
    ----------
    intrinsics (array) : The 3x3 camera intrinsics matrix
    extrinsics (array) : The 3x4 camera extrinsics matrix
    point_x_y (tuple[x,y]) : The coordinates of the point to be projected to BEV

    Returns
    ----------
    tuple[x,y] : the projection of the point
    """
    # Using the camera calibration for Bird Eye View
    # The intrinsics hold parameters such as the focal length and the principal point
    intrinsics_matrix = np.array(intrinsics, dtype='float32')
    # The extrinsic matrix stores the position of the camera in global space:
    # the first 3 columns are the rotation matrix and the last one is the translation vector
    extrinsics_matrix = np.array(extrinsics, dtype='float32')
    # Drop the 3rd column of the extrinsics: it multiplies the z coordinate, which is 0
    extrinsics_matrix = extrinsics_matrix[:, [0, 1, 3]]
    projection_matrix = np.matmul(intrinsics_matrix, extrinsics_matrix)
    # Compute the new coordinates of our point - cv2.perspectiveTransform expects
    # a float array of shape (N, 1, 2)
    list_points_to_detect = np.array([[point_x_y]], dtype=np.float32)
    transformed_points = cv2.perspectiveTransform(list_points_to_detect, projection_matrix)
    return transformed_points[0][0][0], transformed_points[0][0][1]