Tags: python, opencv, aruco, pose-estimation, apriltags

Understanding OpenCV ArUco marker detection/pose estimation in detail: subpixel accuracy


I am currently studying OpenCV's aruco module, focusing in particular on the pose estimation of ArUco markers and AprilTags.

Looking into the subpixel accuracy, I have encountered a strange behaviour, which is demonstrated by the code below: if I provide a 'perfect' calibration (i.e. cx/cy at the image center and zero distortion) and a 'perfect' marker with known edge length, cv.detectMarkers only yields the correct value when the rotation is 0, 90, 180 or 270 degrees. At other orientations the subpixel routine yields an (almost) constant value, but at a "shifted" level. It is clear that at exactly 0/90/180/270 degrees the corner pixels produce a sharp transition and can therefore be detected with high precision. However, I struggle to see where the underestimated length at all other angles comes from. Is this a bug, or does it follow from some trigonometry? Look at the graphs generated by the script below: the error in the pose results from the error in the corner detection, so the detection accuracy depends on the orientation of the tag.

I also checked ArUco markers and different subpixel refinement methods. The "peaks" remain, although the angular behaviour between them changes.
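For reference, switching between refinement methods only requires changing the detector parameters. A minimal sketch (reusing dictionary and rot_img from the full script below; the three cornerRefinement* fields are the standard DetectorParameters members):

para_subpix = cv.aruco.DetectorParameters_create()
para_subpix.cornerRefinementMethod = cv.aruco.CORNER_REFINE_SUBPIX  # gradient-based refinement instead of AprilTag edge fitting
para_subpix.cornerRefinementWinSize = 5            # half-size of the search window in pixels
para_subpix.cornerRefinementMaxIterations = 50     # termination criterion
para_subpix.cornerRefinementMinAccuracy = 0.01     # termination criterion
corners_sp, ids_sp, _ = cv.aruco.detectMarkers(rot_img, dictionary, parameters=para_subpix)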

I am pretty sure that this is not due to the interpolation involved in rotating the marker, since I can observe the same behaviour in real data as well (note, though, that the "height" of the peaks seems to depend on the interpolation method; you can test this by changing the flag in cv.warpAffine, e.g. to cv.INTER_LINEAR).

My questions would then be:

  1. Are the peaks due to a bug or is this expected behaviour?
  2. If the latter, could you help me understand why?
  3. Is there a way to eliminate this orientation dependency of the accuracy (other than increasing the system's resolution so much that no subpixel refinement is required)?

EDIT: Note that the AprilTag functions were added to OpenCV only recently, so you will need to upgrade to the newest version, which is not yet available in some standard repositories. You can, e.g., get an up-to-date build from conda-forge. /EDIT
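If you are unsure whether your build is recent enough, a quick sanity check (assuming the contrib aruco module is present at all):

import cv2 as cv

print(cv.__version__)
# the AprilTag refinement enum only exists in sufficiently recent builds:
print(hasattr(cv.aruco, "CORNER_REFINE_APRILTAG"))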

# -*- coding: utf-8 -*-

import numpy as np
import cv2 as cv
import pylab as plt

""" generate an "ideal" calibration with zero distortion and perfect alignment 
    of the main optical axis: """
cam_matrix = np.array([[1.0e+04, 0.0, 1.224e+03],
                       [0.0, 1.0e+04, 1.024e+03],
                       [0.0, 0.0, 1.0]])
dist_coeffs = np.zeros((14, 1))  # all distortion coefficients set to zero

# define detection parameters
marker_length = 30.00 # some arbitrary value
marker_length_px = 700
marker_id = 3
dictionary = cv.aruco.getPredefinedDictionary(cv.aruco.DICT_APRILTAG_16H5)
para = cv.aruco.DetectorParameters_create()
para.cornerRefinementMethod = cv.aruco.CORNER_REFINE_APRILTAG
para.aprilTagDeglitch = 0           
para.aprilTagMinWhiteBlackDiff = 30
para.aprilTagMaxLineFitMse = 20
para.aprilTagCriticalRad = 0.1745329201221466 *6
para.aprilTagMinClusterPixels = 5  
para.maxErroneousBitsInBorderRate = 0.35
para.errorCorrectionRate = 1.0                    
para.minMarkerPerimeterRate = 0.05                  
para.maxMarkerPerimeterRate = 4                  
para.polygonalApproxAccuracyRate = 0.05
para.minCornerDistanceRate = 0.05

marker_length_list = []
tvec_z_list = []

# generate pictures (AprilTag ID 3, centered in the image, rotated in fixed angular steps of 10 degrees)
degrees_list = np.linspace(0, 350, 36, dtype=int).tolist()  # np.int is deprecated in recent NumPy
marker = cv.aruco.drawMarker(dictionary, marker_id, marker_length_px)
img = np.zeros((2048, 2448), np.uint8)+255
img[674:1374, 874:1574] = marker
cv.imshow("Original", img)
cv.waitKey(1)  # give the HighGUI window a chance to draw
cv.imwrite("original.png", img)
rows, cols = img.shape

for entry in degrees_list:
    # rotate original picture
    rot_mat = cv.getRotationMatrix2D(((cols-1)/2, (rows-1)/2), entry, 1)  # center is given as (x, y), i.e. (column, row)
    rot_img = cv.warpAffine(img, rot_mat, (cols, rows), flags=cv.INTER_CUBIC) # interpolation changes the "peak amplitude" e.g. try cv.INTER_LINEAR instead 
    # detect marker and get pose estimate
    corners, ids, rejected = cv.aruco.detectMarkers(rot_img,dictionary,parameters=para)
    my_index = ids.tolist().index([marker_id])
    fCorners = corners[my_index]
    fRvec,fTvec, _obj_points = cv.aruco.estimatePoseSingleMarkers(fCorners, marker_length, cam_matrix, dist_coeffs)
    # calculate each edge length from the detected (subpixel) corner coordinates
    c = fCorners[0]
    L1 = np.linalg.norm(c[0] - c[1])
    L2 = np.linalg.norm(c[0] - c[3])
    L3 = np.linalg.norm(c[2] - c[1])
    L4 = np.linalg.norm(c[2] - c[3])
    mean_length = (L1 + L2 + L3 + L4) / 4
    # append results
    marker_length_list.append(mean_length)
    tvec_z_list.append(fTvec[0][0][2])
        
plt.figure("TVEC Z")
plt.plot(degrees_list, tvec_z_list, "go--")
plt.xlabel("marker rotation angle (°)")
plt.ylabel("TVEC Z (units of length)")

plt.figure("Mean marker length (should be 700)")
plt.plot(degrees_list, marker_length_list, "bo--")
plt.xlabel("marker rotation angle (°)")
plt.ylabel("marker length (pixels)")

EDIT2: As suggested by Christoph Rackwitz, here are the plots generated by the script:

[Figure: mean marker length vs. rotation angle; should ideally be 700 px]

[Figure: marker distance (TVEC Z) vs. rotation angle; should be invariant under rotation]


Solution

  • Answer by Christoph Rackwitz (see comments):

    you're getting a mere 2% of a pixel of difference (30 ppm in length, 30 ppm in Z). warpAffine uses fixed-point math with 5 fractional bits, i.e. a coordinate granularity of 1/32 px, plus cubic interpolation. The corner refinement method also plays a role (it hopefully works on the edges, not the corners). And all of this happens in a "linear colorspace" (the warpAffine); if you fed it real pictures, those are gamma-mapped, and webcams apply sharpening filters, so you would get even wilder results.
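A possible mitigation for question 3 (my own sketch, not part of the answer; whether it fully removes the orientation dependence is untested): supersample the synthetic rendering so that interpolation and fixed-point quantization happen on a finer grid, then box-average back down with INTER_AREA. The factor k = 4 is an arbitrary choice:

k = 4  # arbitrary supersampling factor
big = cv.resize(img, (cols * k, rows * k), interpolation=cv.INTER_NEAREST)
rot_mat_big = cv.getRotationMatrix2D(((cols * k - 1) / 2, (rows * k - 1) / 2), entry, 1)
rot_big = cv.warpAffine(big, rot_mat_big, (cols * k, rows * k), flags=cv.INTER_CUBIC)
# INTER_AREA box-averages the k*k blocks, giving smoother edge profiles
rot_img = cv.resize(rot_big, (cols, rows), interpolation=cv.INTER_AREA)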