python opencv image-processing

Extract face rectangle from ID card


I’m researching how to extract information from ID cards and have found a suitable algorithm to locate the face on the front: OpenCV has Haar cascades for that. However, I’m unsure what can be used to extract the full rectangle that the person’s photo occupies, rather than just the face (as is done in https://github.com/deepc94/photo-id-ocr). A few ideas that I have yet to test:

  1. Find the second-largest rectangle inside the card that contains the face rect
  2. Expand (“explode”) the face rectangle until it hits a boundary
  3. Experiment with filters to see which features they expose

What else would you recommend trying? Any thoughts, ideas or even existing examples are welcome.


Solution

  • Normal approach:

    import cv2
    import numpy as np
    import matplotlib.pyplot as plt
    
    image = cv2.imread("a.jpg")
    
    gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
    _,thresh = cv2.threshold(gray,128,255,cv2.THRESH_BINARY)
    cv2.imshow("thresh",thresh)
    
    thresh = cv2.bitwise_not(thresh)
    
    element = cv2.getStructuringElement(shape=cv2.MORPH_RECT, ksize=(7, 7))
    
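    # Dilate, then erode with the same kernel to close small gaps.
    # Note: iterations must be passed by keyword (iterations=n);
    # the third positional parameter of dilate/erode is dst.
    dilate = cv2.dilate(thresh, element)
    cv2.imshow("dilate", dilate)
    erode = cv2.erode(dilate, element)
    cv2.imshow("erode", erode)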
    
    # One more closing pass; morphologyEx returns the result directly.
    morph_img = cv2.morphologyEx(erode, cv2.MORPH_CLOSE, element)
    cv2.imshow("morph_img",morph_img)
    
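    # findContours returns (image, contours, hierarchy) in OpenCV 3 and
    # (contours, hierarchy) in OpenCV 4; [-2] selects the contour list in both.
    contours = cv2.findContours(morph_img, cv2.RETR_TREE, cv2.CHAIN_APPROX_SIMPLE)[-2]
    
    # Sort by area, largest first; for this image the third-largest contour is the face.
    contours = sorted(contours, key=cv2.contourArea, reverse=True)
    cnt = contours[2]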
    r = cv2.boundingRect(cnt)
    cv2.rectangle(image,(r[0],r[1]),(r[0]+r[2],r[1]+r[3]),(0,0,255),2)
    
    cv2.imshow("img",image)
    cv2.waitKey(0)
    cv2.destroyAllWindows()
    

    For this image, the two largest contours turned out to be the card boundary, and the third largest is the photo region containing the face. The code draws its bounding rectangle in red on the original image.
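
The "third-largest contour" rule depends on this particular image. A possibly more robust variant, in the spirit of idea 1 from the question, is to take the smallest contour bounding box that still contains the detected face rectangle. The sketch below is only an illustration under assumptions: the helper names `contains` and `smallest_enclosing_rect` are hypothetical, and rectangles are assumed to be `(x, y, w, h)` tuples, which is the format returned by both `cv2.boundingRect` and a Haar cascade's `detectMultiScale`.

```python
def contains(outer, inner):
    """Return True if rectangle `inner` lies fully inside rectangle `outer`.

    Both rectangles are (x, y, w, h) tuples.
    """
    ox, oy, ow, oh = outer
    ix, iy, iw, ih = inner
    return (ox <= ix and oy <= iy
            and ix + iw <= ox + ow
            and iy + ih <= oy + oh)


def area(rect):
    """Area of an (x, y, w, h) rectangle."""
    return rect[2] * rect[3]


def smallest_enclosing_rect(rects, face):
    """Return the smallest rect that contains `face` and is larger than it.

    Returns None when no rectangle qualifies.
    """
    candidates = [r for r in rects if contains(r, face) and area(r) > area(face)]
    return min(candidates, key=area, default=None)
```

With the contours from the code above, the input could be built as `rects = [cv2.boundingRect(c) for c in contours]`, with `face` taken from the cascade detector. Whether the smallest enclosing box really is the photo frame depends on how cleanly the thresholding separates it from the rest of the card.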

    Another way to inspect the image is to sum the pixel values along each axis:

    x_hist = np.sum(morph_img,axis=0).tolist() 
    plt.plot(x_hist)
    plt.ylabel('sum of pixel values by X-axis')
    plt.show()
    
    y_hist = np.sum(morph_img,axis=1).tolist()
    plt.plot(y_hist)
    plt.ylabel('sum of pixel values by Y-axis')
    plt.show()
    

    Based on those pixel sums along the two axes, you can crop the region you want by applying a threshold to each profile.
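
One way to turn the two profiles into crop bounds is to take the longest contiguous run of values at or above a threshold on each axis. This is a minimal sketch, not a tested recipe for ID cards: the function name `longest_run_above` is hypothetical, and picking the longest run (rather than, say, the first one) is an assumption that the photo is the widest dense block in the profile.

```python
def longest_run_above(profile, threshold):
    """Return (start, end) of the longest run with profile[i] >= threshold.

    `end` is exclusive, so the pair can be used directly as a slice.
    Returns None if no value reaches the threshold.
    """
    best = None
    start = None
    # A trailing sentinel below any threshold closes a run that reaches the end.
    for i, value in enumerate(list(profile) + [float("-inf")]):
        if value >= threshold:
            if start is None:
                start = i          # a new run begins here
        elif start is not None:
            if best is None or i - start > best[1] - best[0]:
                best = (start, i)  # keep the longest run seen so far
            start = None
    return best
```

With the `x_hist` and `y_hist` lists from above, this could be used as `x0, x1 = longest_run_above(x_hist, 0.5 * max(x_hist))`, likewise for `y_hist`, followed by `crop = image[y0:y1, x0:x1]`. The factor 0.5 is an arbitrary starting point and would need tuning per image.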