I’m working on an OCR system for Omani license plates and struggling to improve the accuracy of letter recognition. The plates often include small, bold characters, and my current preprocessing pipeline isn’t yielding satisfactory results.
What I have tried so far:
OCR engines: PaddleOCR and Tesseract (--oem 3, --psm 6) both struggled with letter recognition, despite preprocessing and configuration changes.
Preprocessing steps: Sauvola and Wolf-Jolion binarization, upscaling the images by 1.5x, and dilation to enhance the text.
Issue: the letters remain difficult to recognize.
How can I improve preprocessing for better OCR recognition of small, bold letters? Are there any OCR models or custom training approaches better suited to license plates with intricate designs like Oman’s?
sample license plate:
I have worked on many OCR architectures, and I have trained ANPR models for 24 different European languages as well as for India. There is essentially no pre-trained recognizer for license plate OCR, so you need to train a model yourself. You can try several OCR architectures and see which one works best for your application. To train the OCR model, first prepare a dataset of license plate images: scrape images online or crop plates from roadside CCTV footage, then label them manually or with an OCR service such as Amazon's or Google's, so that you end up with a good-quality dataset.
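To compare candidate architectures fairly, one option is to score each on the same held-out labeled plates using character error rate (CER) and exact-plate accuracy. The sketch below is only an illustration under assumptions: the helper names (`normalize`, `edit_distance`, `evaluate`), the file names, and the plate strings are all made up, and each recognizer is assumed to be wrapped as a function that takes an image path and returns the predicted text.

```python
from typing import Callable, Dict, List, Tuple


def normalize(text: str) -> str:
    """Uppercase and drop all whitespace so formatting differences are not counted as errors."""
    return "".join(text.upper().split())


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance: minimum insertions, deletions, and substitutions to turn a into b."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute (or match)
        prev = curr
    return prev[-1]


def evaluate(predict: Callable[[str], str],
             samples: List[Tuple[str, str]]) -> Dict[str, float]:
    """Score one recognizer on (image_path, ground_truth) pairs.

    `predict` is an assumed wrapper around whatever OCR model is being tested.
    """
    total_chars = 0
    char_errors = 0
    exact_matches = 0
    for image_path, label in samples:
        truth = normalize(label)
        pred = normalize(predict(image_path))
        total_chars += len(truth)
        char_errors += edit_distance(pred, truth)
        exact_matches += int(pred == truth)
    return {
        "cer": char_errors / total_chars,
        "plate_accuracy": exact_matches / len(samples),
    }


if __name__ == "__main__":
    # Hypothetical held-out set: file names and plate strings are invented.
    held_out = [("plate_001.jpg", "12345 AB"), ("plate_002.jpg", "9876 C")]

    # Stand-ins for real models: canned predictions keyed by image path.
    model_a = {"plate_001.jpg": "12345AB", "plate_002.jpg": "9876G"}
    model_b = {"plate_001.jpg": "12345A8", "plate_002.jpg": "9876C"}

    for name, preds in [("model_a", model_a), ("model_b", model_b)]:
        print(name, evaluate(preds.get, held_out))
```

Exact-plate accuracy is usually the number that matters in production, since a single wrong character makes the whole plate wrong, while CER helps show whether a model is failing on a few characters (for example the small letters) or across the board.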