
Improving image deskew using Python and OpenCV


The code I've produced to detect and correct skew is giving me inconsistent results. I'm currently working on a project which uses OCR text extraction on images (via Python and OpenCV), so removing skew is key if accurate results are desired. My code uses cv2.minAreaRect to detect skew.

The images I'm using are all identical in layout (and will be in the future), so I'm unsure what is causing these inconsistencies. I've included two sets of before and after images (including the skew value from cv2.minAreaRect) where I applied my code: one showing successful removal of skew, and one where the skew was not removed (it looks like even more skew was added).

Image 1 Before (-87.88721466064453)

Image 1 After (successful deskew)

Image 2 Before (-5.766754150390625)

Image 2 After (unsuccessful deskew)

My code is below. Note: I've worked with many more images than those I've included here. The detected skew thus far has always been in the ranges [-10, 0) or (-90, -80], so I attempted to account for this in my code.

    import cv2
    import numpy as np

    # img is assumed to be a BGR image that has already been loaded
    img_gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
    img_gray = cv2.bitwise_not(img_gray)  # invert: dark text becomes bright foreground

    # Otsu threshold, then collect the (row, col) coordinates of every foreground pixel
    thresh = cv2.threshold(img_gray, 0, 255, cv2.THRESH_BINARY | cv2.THRESH_OTSU)[1]
    coords = np.column_stack(np.where(thresh > 0))
    angle = cv2.minAreaRect(coords)[-1]

    if -10 <= angle < 0:
        # intended to undo skew for values in [-10, 0) by rotating with the opposite sign
        angle = -angle
    else:
        angle = (90 + angle) / 2

    (h, w) = img.shape[:2]
    center = (w // 2, h // 2)

    M = cv2.getRotationMatrix2D(center, angle, 1.0)
    deskewed = cv2.warpAffine(img, M, (w, h), flags=cv2.INTER_CUBIC, borderMode=cv2.BORDER_REPLICATE)

I've looked through various posts and articles to find an adequate solution, but have been unsuccessful. This post was the most helpful in understanding the skew values, but even then I couldn't get very far.


Solution

  • A very good text deskew tool can be found in Python Wand, which uses ImageMagick. It is based upon the Radon transform.

    Form 1: (image)

    Form 2: (image)

    from wand.image import Image
    from wand.display import display
    
    
    with Image(filename='form1.png') as img:
        img.deskew(0.4*img.quantum_range)
        img.save(filename='form1_deskew.png')
        display(img)
    
    with Image(filename='form2.png') as img:
        img.deskew(0.4*img.quantum_range)
        img.save(filename='form2_deskew.png')
        display(img)
    

    Form 1 deskewed: (image)

    Form 2 deskewed: (image)
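
If you would rather stay in NumPy/OpenCV, the idea behind a Radon-style deskew can be sketched with a projection-profile search: project the foreground pixels along a range of candidate angles and keep the angle whose profile is sharpest. This is only a minimal illustration of that idea, not ImageMagick's actual implementation. The names `estimate_skew` and `make_tilted_lines`, the search range of ±10 degrees, the 0.1 degree step, and the sum-of-squares sharpness score are all assumptions made for this example.

```python
import numpy as np


def estimate_skew(binary, max_angle=10.0, step=0.1):
    """Estimate text tilt in degrees (positive = lines descend to the right).

    `binary` is a 2-D array whose nonzero pixels are foreground (text).
    """
    ys, xs = np.nonzero(binary)
    if ys.size == 0:
        return 0.0
    # Centre the coordinates so projected values stay within +/- the diagonal.
    ys = ys - ys.mean()
    xs = xs - xs.mean()
    diag = float(np.hypot(*binary.shape))
    n_bins = int(np.ceil(2 * diag)) + 1  # roughly one bin per pixel

    best_angle, best_score = 0.0, -1.0
    for angle in np.arange(-max_angle, max_angle + step / 2, step):
        theta = np.deg2rad(angle)
        # Signed distance of each pixel from a line tilted by `theta`.
        proj = ys * np.cos(theta) - xs * np.sin(theta)
        hist, _ = np.histogram(proj, bins=n_bins, range=(-diag, diag))
        # Aligned text rows pile up in few bins, so the sum of squares peaks.
        score = float(np.sum(hist.astype(np.float64) ** 2))
        if score > best_score:
            best_angle, best_score = float(angle), score
    return best_angle


def make_tilted_lines(angle_deg, h=200, w=300):
    """Hypothetical test helper: thin 'text lines' tilted by angle_deg."""
    img = np.zeros((h, w), dtype=np.uint8)
    slope = np.tan(np.deg2rad(angle_deg))
    for y0 in range(30, 171, 20):
        for x in range(20, 280):
            y = int(round(y0 + (x - w // 2) * slope))
            img[y, x] = 1
    return img


estimated = estimate_skew(make_tilted_lines(3.0))
print(f"estimated tilt: {estimated:.1f} degrees")
```

With this sign convention (y axis pointing down, positive tilt meaning the lines descend to the right), the estimate should be usable directly as the angle argument of cv2.getRotationMatrix2D, because positive angles there rotate counter-clockwise. It is worth confirming that on one image with a known skew before relying on it. On real scans you would pass the thresholded image (text as nonzero pixels) rather than a synthetic one.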