copencvimage-processingcomputer-visionslam

Robustly finding the local maximum of an image patch with sub-pixel accuracy


I am developing a SLAM algorithm in C, and I have implemented the FAST corner finding method which gives me some strong keypoints in the image. The next step is to get the center of the keypoints with a sub-pixel accuracy, therefore I extract a 3x3 patch around each of them, and do a Least Squares fit of a two dimensional quadratic:

quad

Where f(x,y) is the corner saliency measure of each pixel, similar to the FAST score proposed on the original paper, but modified to also provide a saliency measure in non corner pixels.

And the least squares:

z

z2

b

X

bhat

With being the estimated parameters.
I can now calculate the location of the peak of the fitted quadratic, by taking the gradient equal to zero, achieving my original goal.

The issue arises on some corner cases, where the local peak is closer to the edge of the window, resulting in a fit with low residuals but a peak of the quadratic way outside the window.

An example:
The corner saliency and a contour of the fitted quadratic:

enter image description here

The saliency (blue) and fit (red) as 3D meshes:

enter image description here

Numeric values of this example are (row-major ordering):

[336, 522, 483, 423, 539, 153, 221, 412, 234]

And the resulting sub pixel center of (2.6, -17.1) being wrong.

How can I constrain the fit so the center is within the window?
I'm open to alternative methods for finding the sub pixel peak.


Solution

  • I tried my own code to fit a 2D quadratic function to the 3x3 values, using a stable least-squares solving algorithm, and also found a maximum outside of the domain. The 3x3 patch of data does not match a quadratic function, and therefore the fit is not useful.

    Fitting a 2D quadratic to a 3x3 neighborhood requires a degree of smoothness in the data that you don't seem to have in your FAST output.

    There are many other methods to find the sub-pixel location of the maximum. One that I like because it is more stable and less computationally intensive is the fitting of a "separable" quadratic function. In short, you fit a quadratic function to the three values around the local maximum in one dimension, and then another in the other dimension. Instead of solving 6 parameters with 9 values, this solves 3 parameters with 3 values, twice. The solution is guaranteed stable, as long as the center pixel is larger or equal to all pixels in the 4-connected neighborhood.

    z1 = [f(-1,0), f(0,0), f(1,0)]^T
    
        [1,-1,0]
    X = [0,0,0]
        [1,1,0]
    
    solve: X b1 = z1
    

    and

    z2 = [f(0,-1), f(0,0), f(0,1)]^T
    
        [1,-1,0]
    X = [0,0,0]
        [1,1,0]
    
    solve: X b2 = z2
    

    Now you get the x-coordinate of the centroid from b1 and the y-coordinate from b2.