I'm in the middle of a school project on computer vision that consists of two parts. The first part is the semantic segmentation of a ground picture (small picture), and the second part is locating that small picture in a pre-loaded, pre-segmented map (big picture), with the output being the coordinates and the orientation of the small picture inside the big one.
The first part is already done and working fine, but I have no idea how to approach the second part. When the small picture's orientation is the same as in the original map, I can easily find it by brute force, but problems start when the small image is rotated with respect to the original map.
I have no idea how to approach this problem, so any keyword, topic or algorithm I could use to look for more information online would be appreciated :)
I'm working in MATLAB with the Deep Learning and Computer Vision toolboxes, but I could easily switch to Python if needed or if it would be substantially easier.
Thanks to everyone reading this!
I'm not sure what you mean by "brute force"; if you provide more detail, I may be able to suggest more specific algorithms. In any case, if you want to find a search image inside the same or another image, you can use one of these algorithms:
- SIFT
- SURF
- ORB
- BRISK
- FREAK
- Siamese networks
Most of these algorithms (except the last one) try to find keypoints that are robust against rotation, noise, brightness variations, blur, etc., and then match them using a distance measure such as Hamming, Euclidean or Manhattan distance (see the sketch below).
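Since you already work in MATLAB with the Computer Vision Toolbox, here is a minimal, untested sketch of that keypoint pipeline applied to your problem: detect rotation-invariant features in both images, match their descriptors by distance, then fit a similarity transform to the matches to recover the patch's position and rotation inside the map. The file names `patch.png` and `map.png` are placeholders for your small picture and big map; SURF is used here, but `detectORBFeatures` or `detectSIFTFeatures` plug into the same workflow.

```matlab
% Sketch: locate a rotated patch (small picture) inside a map (big picture).
% Assumes both images are single-channel (e.g. grayscale or label maps);
% file names are placeholders. Requires the Computer Vision Toolbox.
patchImg = imread('patch.png');
mapImg   = imread('map.png');

% 1) Detect rotation-invariant keypoints in both images.
patchPoints = detectSURFFeatures(patchImg);
mapPoints   = detectSURFFeatures(mapImg);

% 2) Extract descriptors around the keypoints.
[patchFeat, patchValid] = extractFeatures(patchImg, patchPoints);
[mapFeat,   mapValid]   = extractFeatures(mapImg,   mapPoints);

% 3) Match descriptors by distance (SSD for SURF; Hamming would be used
%    for binary descriptors such as ORB/BRISK/FREAK).
indexPairs   = matchFeatures(patchFeat, mapFeat);
matchedPatch = patchValid(indexPairs(:, 1));
matchedMap   = mapValid(indexPairs(:, 2));

% 4) Fit a similarity transform (rotation + translation + scale) to the
%    matches; RANSAC inside estimateGeometricTransform rejects outliers.
tform = estimateGeometricTransform(matchedPatch, matchedMap, 'similarity');

% 5) Read the orientation and scale off the forward transform matrix
%    (MATLAB's affine2d uses the row-vector convention [x y 1] * T).
T     = tform.T;
theta = atan2d(T(1, 2), T(1, 1));  % rotation of the patch w.r.t. the map, degrees
scale = hypot(T(1, 1), T(1, 2));   % ~1 if both images share the same resolution

% 6) Map the patch corners into map coordinates to get its location.
[h, w]  = size(patchImg);
corners = transformPointsForward(tform, [1 1; w 1; w h; 1 h]);

fprintf('Rotation: %.1f deg, scale: %.2f\n', theta, scale);
disp(corners);  % patch outline in map coordinates
```

Note this is only a starting point: how well it works on segmented (label-like) images depends on how much texture the segmentation preserves, so you may need to tune the detector or run it on the original images instead of the label maps.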
I myself prefer the last one in terms of accuracy and not having to play with too many hyper-parameters, but Siamese networks need training, which means labeled data and a GPU. SIFT and SURF are well known for image matching; for more details you can read their original papers. I wrote a paper on copy-move forgery detection, which finds parts of an image that were copy-pasted for fraud/forgery purposes; you can find many approaches suited to your problem in papers from that field.