I have 20k small label images, each containing the word "Back" or "Front".
All images have the same resolution, 200 x 25 px.
I can classify these images with 100% accuracy using Tesseract OCR (pytesseract):
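import pytesseract

def classify(img):
    # Run OCR on the label and look for either keyword in the recognized text.
    txt = pytesseract.image_to_string(img, lang='eng')
    if "Front" in txt:
        return "Front"
    if "Back" in txt:
        return "Back"
    return None  # neither word was recognized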
The problem is that it is too slow (about 1 hour for 20k images) and requires installing OCR packages.
I know that even a simple 3-layer CNN would work well here, but I think this problem should be solvable with a simple algorithm, without heavy techniques.
Can you recommend a different approach?
Thank you.
I used OpenCV's template matching (cv2.matchTemplate) to check whether each label matches a "Front" image saved as a sample or is a "Back" image.
This reduced the processing time to about 1 minute.
https://docs.opencv.org/4.x/d4/dc6/tutorial_py_template_matching.html