pythonopencvimage-processingferet-value

Extracting the dimensions of a rectangle


So I'm working on an image processing project with sonar images. To be more specific, I am trying to extract the dimensions of an image of a pool taken by a sonar scanner. I was able to extract the rectangular region of the pool but I can not figure out how to get the dimensions of each edge in terms of pixels. I am working with OpenCV in Python.

I would appreciate if anyone would give me any suggestion on how this could be done.

I already tried finding the line intersections of hough lines but that did not have promising results.

Code so far.

import cv2 
import numpy as np 
from scipy import ndimage as ndi
from scipy.ndimage.measurements import label

def largest_component(indices):
    #this function takes a list of indices denoting
    #the white regions of the image and returns the largest 
    #white object of connected indices 

    return_arr = np.zeros((512,512), dtype=np.uint8)
    for index in indeces:
        return_arr[index[0]][index[1]] = 255

    return return_arr

image = cv2.imread('sonar_dataset/useful/sonarXY_5.bmp', 0)
image_gaussian = ndi.gaussian_filter(image, 4)
image_gaussian_inv = cv2.bitwise_not(image_gaussian)
kernel = np.ones((3,3),np.uint8)

# double thresholding extracting the sides of the rectangle
ret1, img1 = cv2.threshold(image_gaussian_inv, 120, 255, cv2.THRESH_BINARY)
ret2, img2 = cv2.threshold(image_gaussian_inv, 150, 255, cv2.THRESH_BINARY)

double_threshold = img1 - img2
closing = cv2.morphologyEx(double_threshold, cv2.MORPH_CLOSE, kernel1)

labeled, ncomponents = label(closing, kernel)
indices = np.indices(closing.shape).T[:,:,[1, 0]]
twos = indices[labeled == 2]
area =[np.sum(labeled==val) for val in range(ncomponents+1)]

rectangle = largest_component(twos)

cv2.imshow('rectangle', rectangle)
cv2.waitKey(0)

The original image and extracted object are below.

Original Image

Extracted Object


Solution

  • So here's what I came up with - it's a bit labour intensive but it does get us to the right answer eventually. I will be directly using the connected components output that you have shown with the last image.

    1. Use morphological image skeletonization so that we get the skeleton of the blob. This way, it'll give us the most minimal contour representation such that we get a one pixel wide boundary that goes through the middle of each thick edge. You can achieve this through Scikit-image's skeletonize method.

    2. Use the Hough Transform which is a line detection method on the skeletonized image. In summary, it parameterizes lines in the polar domain and the output would be a set of rho and theta that tell us which lines are detected in the skeletonized image. We can use OpenCV's cv2.HoughLines for that. It is very important that you do this on the skeletonized image or we will have a lot of candidate lines parallel to where the true delineation of the bounding box is and you wouldn't be able to distinguish between them.

    3. Take each pair of lines and find their point of intersection. We'd expect that with all pairs of lines, there will be 4 predominant clusters of intersections that give us the corner of each rectangle.

    4. Because of the noise in the contours, we may get more than four points of intersection. We can use the convex hull to finally get 4 points of intersection for the rectangle. In summary, the convex hull algorithm operates on a list of points where it defines a subset of points that can minimally encompass the list of points. We can use cv2.convexHull.

    5. Finally, due to the quantization of the Hough Transform, there may be multiple points that are within the vicinity of each corner. Therefore, apply K-Means clustering to find 4 clusters of points and thus find their centroids. We can use cv2.kmeans for that.

    6. Once we find the centroids, we can simply iterate through each pair of points in a cyclical fashion to finally find the distances to each corner and thus find the distances you care about.

    Let's step through each point one by one:

    Step #1 - Morphological Image Skeletonization

    Using Scikit-image's skeletonize, we can skeletonize the connected components image that you've shown above. Take note that you need to convert the image to binary before you proceed. Once you call the method, we'll need to convert back to unsigned 8-bit integer after for the rest of the process. I've downloaded the image above and saved it locally. We can run the skeletonize method after:

    from skimage.morphology import skeletonize
    
    im = cv2.imread('K7ELI.png', 0)
    out = skeletonize(im > 0)
    
    # Convert to uint8
    out = 255*(out.astype(np.uint8))
    

    We get this image:

    fig1

    Step #2 - Use the Hough Transform

    Using the Hough Transform, we can detect the most prominent lines in this image:

    lines = cv2.HoughLines(out,1,np.pi/180,60)
    

    Here we specify the search space so that we look for lines where the bin size has a length of 1 and the angles have a bin of 1 degree, or pi / 180 radians. In summary, the Hough Transform looks at each edge point and iterates through a range of angles theta that are subtended from the origin to each edge point and calculate the corresponding value of rho respecting the bin size. This pair gets logged into a 2D histogram and we register a vote. We threshold this 2D histogram so that any bins beyond a certain value are line candidates. In the above line of code, set the threshold for the bin counts to be 60.

    This code is optional, but I wanted to show you what the visualized lines look like:

    img_colour = np.dstack([im, im, im])
    lines = cv2.HoughLines(edges,1,np.pi/180,60)
    for rho,theta in lines[:,0]:
        a = np.cos(theta)
        b = np.sin(theta)
        x0 = a*rho
        y0 = b*rho
        x1 = int(x0 + 1000*(-b))
        y1 = int(y0 + 1000*(a))
        x2 = int(x0 - 1000*(-b))
        y2 = int(y0 - 1000*(a))
        cv2.line(img_colour,(x1,y1),(x2,y2),(0,0,255),2)
    

    This code I pulled from the following tutorial. It draws the Hough Transform detected lines in the image as red. I get the following image:

    fig2

    As we can see, there are four points of intersection in the image. It's our job next to find these points of intersection.

    Step #3 - Find points of intersection

    In the Hough Transform, we can relate the length of the line from the origin to a point (x, y) in the image subtended at the angle theta by:

    rho = x*cos(theta) + y*sin(theta)
    

    We can also form the equation of the line y = m*x + c in Cartesian form. We can transform between the two by dividing both sides of the rho equation by sin(theta) then moving the relevant terms to each side:

    Therefore, we should cycle through all unique pairs of lines and using the above equation, we can find their point of intersection by setting their Cartesian forms to be equal to each other. This I won't derive for you for the interest of saving space, but simply set two lines in Cartesian form equal to each other and solve for the x coordinate of intersection. Once that's done, substitute this point into any of the two lines to find the y coordinate. We should obviously skip intersection points that go outside of the image in the case of two nearly parallel lines or if we choose two pairs of lines that go in the same direction and don't intersect.

    pts = []
    for i in range(lines.shape[0]):
        (rho1, theta1) = lines[i,0]
        m1 = -1/np.tan(theta1)
        c1 = rho1 / np.sin(theta1)
        for j in range(i+1,lines.shape[0]):
            (rho2, theta2) = lines[j,0]
            m2 = -1 / np.tan(theta2)
            c2 = rho2 / np.sin(theta2)
            if np.abs(m1 - m2) <= 1e-8:
                continue
            x = (c2 - c1) / (m1 - m2)
            y = m1*x + c1
            if 0 <= x < img.shape[1] and 0 <= y < img.shape[0]:
                pts.append((int(x), int(y)))
    

    pts is a list of tuples such that we add all points of intersection that are within the image that are not out of bounds.

    Step #4 - Use the Convex Hull

    We can use this list of tuples and use the convex hull so that we find a list of points that define the outer perimeter of the rectangle. Take note that the order of points defining the rectangle are counter-clockwise. This doesn't matter for this step but it will matter later:

    pts = np.array(pts)
    pts = pts[:,None] # We need to convert to a 3D numpy array with a singleton 2nd dimension
    hull = cv2.convexHull(pts)
    

    hull contains a 3D NumPy array that is a subset of the original points of intersection that create the outer boundary of the image. We can use these points to draw where these are located in the image for illustration

    out2 = np.dstack([im, im, im])
    for pt in hull[:,0]:
        cv2.circle(out2, tuple(pt), 2, (0, 255, 0), 2)
    

    I've taken the original image and drawn the corner points in green. We get this image:

    fig3

    Step #5 - Apply K-Means clustering

    As you can see in the image above, there are multiple points that map to each corner. It would be good if we can consolidate the multiple points at each corner into a single point. One way is to average all of the points in each corner and the easiest way to do that out of box is to use K-Means clustering. We need the centroids to thus give us the final corner points of the rectangle. We need to make sure we specify 4 clusters to find.

    From the K-Means clustering tutorial from the OpenCV docs, we can use this code:

    # Define criteria = ( type, max_iter = 10 , epsilon = 1.0 )
    criteria = (cv2.TERM_CRITERIA_EPS + cv2.TERM_CRITERIA_MAX_ITER, 10, 1.0)
    
    # Set flags (Just to avoid line break in the code)
    flags = cv2.KMEANS_RANDOM_CENTERS
    
    # Apply KMeans
    # The convex hull points need to be float32
    z = hull.copy().astype(np.float32)
    compactness,labels,centers = cv2.kmeans(z,4,None,criteria,10,flags)
    

    The first parameter is the convex hull of points that need to be in float32 as required by the algorithm. The second parameter specifies the number of clusters we want to search for, so 4 in our case. The third parameter you can skip. It is a placeholder for the best cluster ID each point is assigned to but we don't need to use it. criteria are the K-Means parameters used for the mechanics of the algorithm, and the fifth parameter tells us how many attempts we should run to find the best clusters. We choose 10, meaning that we run K-Means 10 times and choose the clustering configuration that has the least amount of error. The error is stored in the compactness variable that is output from the algorithm. Finally, the last variable are optional flags and we set this so that the initial centroids of the algorithm are simply selected at random from the points.

    labels provides which cluster ID is assigned to each point and centers is the key variable we need which thus returns:

    array([[338.5   , 152.5   ],
           [302.6667, 368.6667],
           [139.    , 340.    ],
           [178.5   , 127.    ]], dtype=float32)
    

    These are the four corner points of the rectangle. We can see where these line up by drawing them directly on the original image, and we also get this image:

    out3 = np.dstack([im, im, im])
    for pt in centers:
        cv2.circle(out3, tuple(pt), 2, (0, 255, 0), 2)
    

    fig5

    Step #6 - Measure the lengths now

    Finally, we can cycle through each pair of lines and find the corresponding dimensions. Take note that because K-Means has the centroids in random order due to the random nature of the algorithm, we can run the convex hull on these centroids to ensure that the order is circular.

    centers = cv2.convexHull(centers)[:,0]
    for (i, j) in zip(range(4), [1, 2, 3, 0]):
        length = np.sqrt(np.sum((centers[i] - centers[j])**2.0))
        print('Length of side {}: {}'.format(i+1, length))
    

    We thus get:

    Length of side 1: 219.11654663085938
    Length of side 2: 166.1582489013672
    Length of side 3: 216.63160705566406
    Length of side 4: 162.019287109375
    

    If you want perspective to see how the bounding box lines up, let's actually draw these lines on the image that are defined at these centres:

    out4 = np.dstack([im, im, im])
    for (i, j) in zip(range(4), [1, 2, 3, 0]):
        cv2.line(out4, tuple(centers[i]), tuple(centers[j]), (0, 0, 255), 2)
    

    We get:

    fig5

    To see where this lines up with the original image, let's just repeat the code above but drawing the lines on the original image. I downloaded a copy of the original image to do so:

    out5 = cv2.imread('no8BP.png') # Note - grayscale image read in as colour
    for (i, j) in zip(range(4), [1, 2, 3, 0]):
        cv2.line(out5, tuple(centers[i]), tuple(centers[j]), (0, 0, 255), 2)
    

    fig6


    For the sake of completeness, here's the entire code from start to finish without all of the debug outputs - we go from reading the image to drawing the lines in the original image with printing the lengths of each side in the detected rectangle.

    from skimage.morphology import skeletonize
    import cv2
    import numpy as np
    
    # Step #1 - Skeletonize
    im = cv2.imread('K7ELI.png', 0)
    out = skeletonize(im > 0)
    
    # Convert to uint8
    out = 255*(out.astype(np.uint8))
    
    # Step #2 - Hough Transform
    lines = cv2.HoughLines(out,1,np.pi/180,60)
    
    # Step #3 - Find points of intersection
    pts = []
    for i in range(lines.shape[0]):
        (rho1, theta1) = lines[i,0]
        m1 = -1/np.tan(theta1)
        c1 = rho1 / np.sin(theta1)
        for j in range(i+1,lines.shape[0]):
            (rho2, theta2) = lines[j,0]
            m2 = -1 / np.tan(theta2)
            c2 = rho2 / np.sin(theta2)
            if np.abs(m1 - m2) <= 1e-8:
                continue
            x = (c2 - c1) / (m1 - m2)
            y = m1*x + c1
            if 0 <= x < img.shape[1] and 0 <= y < img.shape[0]:
                pts.append((int(x), int(y)))
    
    # Step #4 - Find convex hull
    pts = np.array(pts)
    pts = pts[:,None] # We need to convert to a 3D numpy array with a singleton 2nd dimension
    hull = cv2.convexHull(pts)
    
    # Step #5 - K-Means clustering
    # Define criteria = ( type, max_iter = 10 , epsilon = 1.0 )
    criteria = (cv2.TERM_CRITERIA_EPS + cv2.TERM_CRITERIA_MAX_ITER, 10, 1.0)
    
    # Set flags (Just to avoid line break in the code)
    flags = cv2.KMEANS_RANDOM_CENTERS
    
    # Apply KMeans
    # The convex hull points need to be float32
    z = hull.copy().astype(np.float32)
    compactness,labels,centers = cv2.kmeans(z,4,None,criteria,10,flags)
    
    # Step #6 - Find the lengths of each side
    centers = cv2.convexHull(centers)[:,0]
    for (i, j) in zip(range(4), [1, 2, 3, 0]):
        length = np.sqrt(np.sum((centers[i] - centers[j])**2.0))
        print('Length of side {}: {}'.format(i+1, length))
    
    # Draw the sides of each rectangle in the original image
    out5 = cv2.imread('no8BP.png') # Note - grayscale image read in as colour
    for (i, j) in zip(range(4), [1, 2, 3, 0]):
        cv2.line(out5, tuple(centers[i]), tuple(centers[j]), (0, 0, 255), 2)
    
    # Show the image
    cv2.imshow('Output', out5); cv2.waitKey(0); cv2.destroyAllWindows()