python opencv

Splitting image by whitespace


I have an image that I am trying to split into its separate components. I have successfully created a mask of the objects in the image using k-means clustering (the original image and the result are included below).

I am now trying to crop each individual part out of the original image and save it as a new image. Is this possible?

import numpy as np
import cv2

path = 'software (1).jpg'
img = cv2.imread(path)

img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)
twoDimage = img.reshape((-1,3))
twoDimage = np.float32(twoDimage)

criteria = (cv2.TERM_CRITERIA_EPS + cv2.TERM_CRITERIA_MAX_ITER, 10, 1.0)
K = 2
attempts=10

ret,label,center = cv2.kmeans(twoDimage,K,None,criteria,attempts,cv2.KMEANS_PP_CENTERS)
center = np.uint8(center)
res = center[label.flatten()]
result_image = res.reshape((img.shape))


cv2.imwrite('result.jpg',result_image)

Original image

Result of k-means


Solution

  • My solution involves creating a binary object mask where all the objects are colored in white and the background in black. I then extract each object based on area, from largest to smallest. I use this "isolated object" mask to segment each object in the original image. I then write the result to disk. These are the steps:

    1. Resize the image (your original input is gigantic)
    2. Convert to grayscale and threshold it to get the initial binary mask
    3. Extract each object based on area from largest to smallest
    4. Create a binary mask of the isolated object
    5. Apply a little bit of morphology to enhance the mask
    6. Mask the original BGR image with the binary mask
    7. Apply flood-fill to color the background with white
    8. Save image to disk
    9. Repeat the process for all the objects in the image

    Let's see the code. Throughout the script I use two helper functions: writeImage and findBiggestBlob. The first function is pretty self-explanatory. The second function creates a binary mask of the biggest blob in a binary input image. Both functions are presented here:

    # Writes a PNG image:
    def writeImage(imagePath, inputImage):
        imagePath = imagePath + ".png"
        cv2.imwrite(imagePath, inputImage, [cv2.IMWRITE_PNG_COMPRESSION, 0])
        print("Wrote Image: " + imagePath)
    
    
    def findBiggestBlob(inputImage):
        # Store a copy of the input image:
        biggestBlob = inputImage.copy()
        # Set initial values for the largest contour:
        largestArea = 0
        largestContourIndex = 0
    
        # Find the contours on the binary image:
        contours, hierarchy = cv2.findContours(inputImage, cv2.RETR_CCOMP, cv2.CHAIN_APPROX_SIMPLE)
    
        # Get the largest contour in the contours list:
        for i, cc in enumerate(contours):
            # Find the area of the contour:
            area = cv2.contourArea(cc)
            # Store the index of the largest contour:
            if area > largestArea:
                largestArea = area
                largestContourIndex = i
    
        # Once we get the biggest blob, paint it black:
        tempMat = inputImage.copy()
        cv2.drawContours(tempMat, contours, largestContourIndex, (0, 0, 0), -1, 8, hierarchy)
        # Erase smaller blobs:
        biggestBlob = biggestBlob - tempMat
    
        return biggestBlob
    

    Now, let's check out the main script. Let's read the image and get the initial binary mask:

    # Imports
    import cv2
    import numpy as np
    
    # Read image
    imagePath = "D://opencvImages//"
    inputImage = cv2.imread(imagePath + "L85Bu.jpg")
    
    # Get image dimensions
    originalImageHeight, originalImageWidth = inputImage.shape[:2]
    
    # Resize at a fixed scale:
    resizePercent = 30
    resizedWidth = int(originalImageWidth * resizePercent / 100)
    resizedHeight = int(originalImageHeight * resizePercent / 100)
    
    # Resize image
    inputImage = cv2.resize(inputImage, (resizedWidth, resizedHeight), interpolation=cv2.INTER_LINEAR)
    writeImage(imagePath+"objectInput", inputImage)
    
    # Deep BGR copy:
    colorCopy = inputImage.copy()
    
    # Convert BGR to grayscale:
    grayscaleImage = cv2.cvtColor(inputImage, cv2.COLOR_BGR2GRAY)
    
    # Fixed (inverted) threshold at 250: near-white background -> 0, objects -> 255
    _, binaryImage = cv2.threshold(grayscaleImage, 250, 255, cv2.THRESH_BINARY_INV)
    

    This is the input resized to 30% of its original size, as set by resizePercent:

    And this is the binary mask created with a fixed threshold of 250:
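To make the fixed threshold concrete, here is a minimal NumPy-only sketch of what an inverted binary threshold at 250 computes: pixels brighter than 250 become 0 and everything else becomes 255. The helper name `invertedThreshold` and the tiny sample array are made up for illustration; the script itself relies on `cv2.threshold`.

```python
import numpy as np

def invertedThreshold(grayImage, thresh=250, maxValue=255):
    # Mirror THRESH_BINARY_INV: values above thresh -> 0, the rest -> maxValue
    return np.where(grayImage > thresh, 0, maxValue).astype(np.uint8)

# Hypothetical 2x3 grayscale patch: near-white background and darker object pixels
sample = np.array([[255, 251, 250],
                   [120,   0, 254]], dtype=np.uint8)
mask = invertedThreshold(sample)
print(mask)
```

Note that a pixel equal to the threshold (250) counts as object, because only values strictly above the threshold are zeroed.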

    Now, I'm gonna run this mask through a while loop. With each iteration I'll extract the biggest blob until there are no blobs left. Each step will create a new binary mask where the only thing present is one object at a time. This will be the key to isolating the objects in the original (resized) BGR image:

    # Image counter to write pngs to disk:
    imageCounter = 0
    
    # Segmentation flag to stop the processing loop:
    segmentObjects = True
    
    while (segmentObjects):
    
        # Get biggest object on the mask:
        currentBiggest = findBiggestBlob(binaryImage)
    
        # Use a little bit of morphology to "widen" the mask:
        kernelSize = 3
        opIterations = 2
        morphKernel = cv2.getStructuringElement(cv2.MORPH_RECT, (kernelSize, kernelSize))
        # Perform Dilate:
        binaryMask = cv2.morphologyEx(currentBiggest, cv2.MORPH_DILATE, morphKernel, None, None, opIterations, cv2.BORDER_REFLECT101)
    
        # Mask the original BGR (resized) image:
        blobMask = cv2.bitwise_and(colorCopy, colorCopy, mask=binaryMask)
    
        # Flood-fill at the top left corner:
        fillPosition = (0, 0)
        # Use white color:
        fillColor = (255, 255, 255)
        colorTolerance = (0,0,0)
        cv2.floodFill(blobMask, None, fillPosition, fillColor, colorTolerance, colorTolerance)
    
        # Write file to disk:
        writeImage(imagePath+"object-"+str(imageCounter), blobMask)
        imageCounter+=1
    
        # Subtract the current biggest blob from
        # the binary mask:
        binaryImage = binaryImage - currentBiggest
    
        # Check for stop condition - all pixels
        # in the binary mask should be black:
        whitePixels = cv2.countNonZero(binaryImage)
    
        # Compare against a threshold - 1% of
        # the resized image's pixel count:
        whitePixelThreshold = 0.01 * (resizedWidth * resizedHeight)
        if (whitePixels < whitePixelThreshold):
            segmentObjects = False
    

    There are some things worth noting here. This is the first isolated mask created for the first object:

    Nice. Simply masking the BGR image with this would already work. However, I can improve the quality of the mask by applying a dilate morphological operation. This "widens" the blob so that it covers the original outline by a few extra pixels (the operation replaces each pixel with the maximum intensity found within its neighborhood). Next, the masking produces a BGR image that contains only the object blob on a black background. I don't want that black background, I want it white, so I flood-fill from the top-left corner to get the first isolated BGR object:
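The dilation and the white background can be sketched without OpenCV to show what they do to the pixels. This is only an illustration under assumptions: `dilate3x3` and `maskOnWhite` are hypothetical helpers, the border is zero-padded (the script passes `cv2.BORDER_REFLECT101`), and the background is painted white directly rather than flood-filled from a corner.

```python
import numpy as np

def dilate3x3(binaryMask, iterations=1):
    # Each output pixel becomes the maximum of its 3x3 neighborhood
    result = binaryMask.copy()
    for _ in range(iterations):
        padded = np.pad(result, 1, mode="constant", constant_values=0)
        height, width = result.shape
        # Collect the nine shifted views and take the per-pixel maximum
        shifted = [padded[dy:dy + height, dx:dx + width]
                   for dy in range(3) for dx in range(3)]
        result = np.max(shifted, axis=0)
    return result

def maskOnWhite(bgrImage, binaryMask):
    # Keep the pixels under the mask and paint everything else white
    output = np.full_like(bgrImage, 255)
    keep = binaryMask > 0
    output[keep] = bgrImage[keep]
    return output

# Hypothetical 5x5 example: a single white pixel grows into a 3x3 square
mask = np.zeros((5, 5), dtype=np.uint8)
mask[2, 2] = 255
widened = dilate3x3(mask)

image = np.full((5, 5, 3), 40, dtype=np.uint8)  # uniform dark "object" color
isolated = maskOnWhite(image, widened)
```

One difference is worth keeping in mind: a flood-fill with zero tolerance only recolors black pixels connected to the seed point, so truly black pixels inside an object survive, whereas this sketch whitens every pixel outside the mask.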

    I save each masked image to a new file on disk. Very cool. Now, the condition to break from the loop is pretty simple: stop when all the blobs have been processed. To achieve this, I subtract the current biggest blob from the binary mask and count the remaining white pixels. When the count falls below a certain threshold (in this case 1% of the pixels in the resized image, per the 0.01 factor in the code), the loop stops.
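As a quick check of the stop condition's arithmetic, here is a small sketch. The `shouldStop` helper and the 600x400 size are hypothetical; the factor of 0.01 matches the `whitePixelThreshold` line in the loop, which works out to 1% of the pixel count.

```python
def shouldStop(whitePixels, width, height, fraction=0.01):
    # Stop once the remaining white pixels cover less than `fraction` of the image
    return whitePixels < fraction * (width * height)

# Hypothetical 600x400 resized image: the cutoff is about 0.01 * 240000 = 2400 pixels
print(shouldStop(5000, 600, 400))  # False: a sizeable blob is still left
print(shouldStop(1200, 600, 400))  # True: only small leftovers remain
```

A fractional cutoff like this means small specks of noise left in the mask do not keep the loop running, but it also means genuinely small objects below the cutoff are never written out.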

    Check out this gif of every object isolated. Each frame is saved to disk as a png file:
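Each saved file above keeps the full canvas size. If a tight crop around each object is wanted, as the question asks, one possible approach is to slice the image to the bounding box of the isolated mask. The sketch below uses only NumPy; `cropToMask` is a hypothetical helper and not part of the script above. OpenCV's `cv2.boundingRect` can compute a similar rectangle.

```python
import numpy as np

def cropToMask(image, binaryMask):
    # Rows and columns that contain at least one mask pixel
    rows = np.flatnonzero(binaryMask.any(axis=1))
    cols = np.flatnonzero(binaryMask.any(axis=0))
    if rows.size == 0:
        return None  # empty mask, nothing to crop
    # Slice the tight bounding box (end index is exclusive, hence the +1)
    return image[rows[0]:rows[-1] + 1, cols[0]:cols[-1] + 1]

# Hypothetical 6x8 image holding a 2x3 object
image = np.arange(6 * 8 * 3, dtype=np.uint8).reshape(6, 8, 3)
mask = np.zeros((6, 8), dtype=np.uint8)
mask[1:3, 4:7] = 255
crop = cropToMask(image, mask)
print(crop.shape)  # (2, 3, 3)
```

In the loop, this would presumably be applied to the white-filled image together with the dilated mask just before writing the file, so that each PNG contains only the object and a thin margin.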