swift, apple-vision

Find image within image (template matching?)


I need to find the location of an image that the user provides within an image that I provide.

It is safe to assume that, at the time of analysis, the user-provided image is certain to be contained within the image it is compared against.

I’ve looked through, and even have some experience with, Core ML and Vision image classification; however, I am struggling to convince myself that it is the correct approach to this problem. I feel that the way Vision handles “feature values” is almost the reverse of what I’m looking for.

My question: Is there a feature of Core ML or Vision that tackles this particular problem head-on?

Other information that may be needed:

  • It is not safe to assume that the provided images match pixel for pixel, due to possible resolution differences.

  • They may also be provided in any shape, although it is possible to crop them to a standardised shape before analysis.

  • Rotation will also need to be accounted for.

  • The user-provided image will never appear more than once in the target image.


Solution

  • Take a look at feature detection and matching algorithms. For example, you could use SIFT (Scale-Invariant Feature Transform) with RANSAC (Random Sample Consensus) to do exactly what you describe: SIFT finds keypoints that are invariant to scale and rotation, and RANSAC robustly estimates the transform that maps the user-provided image into yours (see the sketch below).

    If you are using OpenCV, there are plenty of such algorithms readily available (FAST, Shi-Tomasi, etc.).
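
    A minimal sketch of the SIFT + RANSAC pipeline, written in Python since OpenCV's Python bindings are the most widely documented (OpenCV can also be used from a Swift project via its iOS framework). The file names, the 0.75 ratio-test threshold, and the 5.0 px reprojection threshold are illustrative assumptions, not values from this answer:

        import cv2
        import numpy as np

        # The larger image and the user-provided image (placeholder file names).
        scene = cv2.imread("scene.png", cv2.IMREAD_GRAYSCALE)
        template = cv2.imread("template.png", cv2.IMREAD_GRAYSCALE)

        # Detect scale- and rotation-invariant keypoints and compute descriptors.
        sift = cv2.SIFT_create()
        kp_t, des_t = sift.detectAndCompute(template, None)
        kp_s, des_s = sift.detectAndCompute(scene, None)

        # Match each template descriptor to its two nearest scene descriptors,
        # then keep only unambiguous matches (Lowe's ratio test; 0.75 is typical).
        matcher = cv2.BFMatcher(cv2.NORM_L2)
        good = []
        for pair in matcher.knnMatch(des_t, des_s, k=2):
            if len(pair) == 2 and pair[0].distance < 0.75 * pair[1].distance:
                good.append(pair[0])

        # RANSAC estimates the homography mapping template points into the
        # scene while rejecting outlier matches (5.0 px reprojection threshold).
        src = np.float32([kp_t[m.queryIdx].pt for m in good]).reshape(-1, 1, 2)
        dst = np.float32([kp_s[m.trainIdx].pt for m in good]).reshape(-1, 1, 2)
        H, inlier_mask = cv2.findHomography(src, dst, cv2.RANSAC, 5.0)

        # Project the template's corners into the scene to get its location,
        # which also captures any rotation and scale difference.
        h, w = template.shape
        corners = np.float32([[0, 0], [w, 0], [w, h], [0, h]]).reshape(-1, 1, 2)
        located = cv2.perspectiveTransform(corners, H)
        print(located.reshape(4, 2))

    Because the question guarantees the template appears exactly once, a very low inlier count from RANSAC indicates a failed match rather than a second candidate location.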