I have an image that I am trying to split into its separate components. I have successfully created a mask of the objects in the image using k-means clustering (the result and mask are included below).
I am now trying to crop each individual part out of the original image and save it as a new image. Is this possible?
import numpy as np
import cv2
path = 'software (1).jpg'
img = cv2.imread(path)
img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)
twoDimage = img.reshape((-1,3))
twoDimage = np.float32(twoDimage)
criteria = (cv2.TERM_CRITERIA_EPS + cv2.TERM_CRITERIA_MAX_ITER, 10, 1.0)
K = 2
attempts=10
ret,label,center = cv2.kmeans(twoDimage,K,None,criteria,attempts,cv2.KMEANS_PP_CENTERS)
center = np.uint8(center)
res = center[label.flatten()]
result_image = res.reshape((img.shape))
cv2.imwrite('result.jpg',result_image)
My solution involves creating a binary object mask where all the objects are colored white and the background black. I then extract each object based on area, from largest to smallest. I use this "isolated object" mask to segment each object in the original image, and then write the result to disk.
Let's see the code. Throughout the script I use two helper functions: writeImage and findBiggestBlob. The first function is pretty self-explanatory. The second function creates a binary mask of the biggest blob in a binary input image. Both functions are presented here:
# Writes a PNG image:
def writeImage(imagePath, inputImage):
    imagePath = imagePath + ".png"
    cv2.imwrite(imagePath, inputImage, [cv2.IMWRITE_PNG_COMPRESSION, 0])
    print("Wrote Image: " + imagePath)

def findBiggestBlob(inputImage):
    # Store a copy of the input image:
    biggestBlob = inputImage.copy()
    # Set initial values for the largest contour:
    largestArea = 0
    largestContourIndex = 0
    # Find the contours on the binary image:
    contours, hierarchy = cv2.findContours(inputImage, cv2.RETR_CCOMP, cv2.CHAIN_APPROX_SIMPLE)
    # Get the largest contour in the contours list:
    for i, cc in enumerate(contours):
        # Find the area of the contour:
        area = cv2.contourArea(cc)
        # Store the index of the largest contour:
        if area > largestArea:
            largestArea = area
            largestContourIndex = i
    # Once we get the biggest blob, paint it black:
    tempMat = inputImage.copy()
    cv2.drawContours(tempMat, contours, largestContourIndex, (0, 0, 0), -1, 8, hierarchy)
    # Erase smaller blobs:
    biggestBlob = biggestBlob - tempMat
    return biggestBlob
Now, let's check out the main script. Let's read the image and get the initial binary mask:
# Imports
import cv2
import numpy as np
# Read image
imagePath = "D://opencvImages//"
inputImage = cv2.imread(imagePath + "L85Bu.jpg")
# Get image dimensions
originalImageHeight, originalImageWidth = inputImage.shape[:2]
# Resize at a fixed scale:
resizePercent = 30
resizedWidth = int(originalImageWidth * resizePercent / 100)
resizedHeight = int(originalImageHeight * resizePercent / 100)
# Resize image
inputImage = cv2.resize(inputImage, (resizedWidth, resizedHeight), interpolation=cv2.INTER_LINEAR)
writeImage(imagePath+"objectInput", inputImage)
# Deep BGR copy:
colorCopy = inputImage.copy()
# Convert BGR to grayscale:
grayscaleImage = cv2.cvtColor(inputImage, cv2.COLOR_BGR2GRAY)
# Threshold at a fixed value of 250 (inverted):
_, binaryImage = cv2.threshold(grayscaleImage, 250, 255, cv2.THRESH_BINARY_INV)
This is the input resized to 30% of its original size, according to resizePercent:
And this is the binary mask created with a fixed threshold of 250:
Now, I'm going to run this mask through a while loop. With each iteration I'll extract the biggest blob until there are no blobs left. Each step will create a new binary mask in which only one object is present at a time. This will be the key to isolating the objects in the original (resized) BGR image:
# Image counter to write pngs to disk:
imageCounter = 0
# Segmentation flag to stop the processing loop:
segmentObjects = True
while segmentObjects:
    # Get biggest object on the mask:
    currentBiggest = findBiggestBlob(binaryImage)
    # Use a little bit of morphology to "widen" the mask:
    kernelSize = 3
    opIterations = 2
    morphKernel = cv2.getStructuringElement(cv2.MORPH_RECT, (kernelSize, kernelSize))
    # Perform Dilate:
    binaryMask = cv2.morphologyEx(currentBiggest, cv2.MORPH_DILATE, morphKernel, None, None, opIterations, cv2.BORDER_REFLECT101)
    # Mask the original BGR (resized) image:
    blobMask = cv2.bitwise_and(colorCopy, colorCopy, mask=binaryMask)
    # Flood-fill at the top left corner:
    fillPosition = (0, 0)
    # Use white color:
    fillColor = (255, 255, 255)
    colorTolerance = (0, 0, 0)
    cv2.floodFill(blobMask, None, fillPosition, fillColor, colorTolerance, colorTolerance)
    # Write file to disk:
    writeImage(imagePath + "object-" + str(imageCounter), blobMask)
    imageCounter += 1
    # Subtract current biggest blob from
    # original binary mask:
    binaryImage = binaryImage - currentBiggest
    # Check for stop condition - (almost) all pixels
    # in the binary mask should be black:
    whitePixels = cv2.countNonZero(binaryImage)
    # Compare against a threshold - 1% of
    # the resized image's pixel count:
    whitePixelThreshold = 0.01 * (resizedWidth * resizedHeight)
    if whitePixels < whitePixelThreshold:
        segmentObjects = False
There are some things worth noting here. This is the first isolated mask created for the first object:
Nice. Simply masking the BGR image with it would do. However, I can improve the quality of the mask if I apply a dilate morphological operation. This will "widen" the blob, covering the original outline by a few pixels. (The operation actually searches for the maximum intensity pixel within a neighborhood of pixels.) Next, the masking will produce a BGR image where there's only the object blob and a black background. I don't want that black background; I want it white. I flood-fill at the top left corner to get the first BGR mask:
I save each mask to a new file on disk. Very cool. Now, the condition to break from the loop is pretty simple: stop when all the blobs have been processed. To achieve this I subtract the current biggest blob from the original binary mask and count the number of white pixels that remain. When the count is below a certain threshold (in this case 1% of the resized image's pixel count, matching the 0.01 factor in the code), stop the loop.
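The stop condition boils down to a bit of arithmetic, which can be sketched on its own. The helper name should_stop and the synthetic mask are assumptions for illustration; I use np.count_nonzero here, which counts the same thing as cv2.countNonZero on a single-channel mask.

```python
import numpy as np

def should_stop(mask, fraction=0.01):
    """True when fewer than `fraction` of the pixels in `mask` are non-zero.

    Hypothetical helper mirroring the loop's stop test.
    """
    height, width = mask.shape[:2]
    return np.count_nonzero(mask) < fraction * (width * height)

# Synthetic 100 x 100 mask: the cutoff is 0.01 * 10000 = 100 white pixels.
mask = np.zeros((100, 100), dtype=np.uint8)

mask[0:5, 0:10] = 255        # 50 white pixels, below the cutoff
print(should_stop(mask))

mask[50:60, 50:60] = 255     # 150 white pixels in total, above the cutoff
print(should_stop(mask))
```

One consequence of using a fraction instead of "exactly zero white pixels" is that small leftover specks never get written out as objects, which is usually what you want after a threshold.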
Check out this gif of every object isolated. Each frame is saved to disk as a png file: