Tags: opencv, image-processing, computer-vision, background-subtraction, feret-value

Issues with Detecting Object Borders Due to Color or Transparency in OpenCV and Feret Measurements


- Object Detection: Due to the color or transparency of the items, the borders are not accurately detected using the background subtraction method.

- Measurements: This inaccuracy leads to incorrect Feret diameter measurements.

I'm working on a project where I need to detect and measure items in images. Each image contains a picture of an item, and I'm using the background **subtraction** method in OpenCV (cv2) to remove the background, then the feret library to calculate the maximum and minimum Feret diameters. However, I'm facing challenges with detecting the borders of the items, particularly due to their color or transparency.

Here is the code that I'm using (I'm running this in a Jupyter Notebook):

import os
import cv2 as cv  # OpenCV for image preprocessing
import numpy as np
import matplotlib.pyplot as plt
import feret  # Library for computing Feret diameters

# Function to display images in Jupyter
def display_image(image, title='Image'):
    plt.figure(figsize=(10, 10))
    plt.imshow(image, cmap='gray')
    plt.title(title)
    plt.axis('off')
    plt.show()

# Path to the folder containing images and the background image
image_folder = '.'
background_image_path = 'Main.jpeg'

# Load the background image
background_image = cv.imread(background_image_path, cv.IMREAD_COLOR)
if background_image is None:
    print(f"Unable to load background image: {background_image_path}")
    exit(1)

# Convert background image to grayscale
background_gray = cv.cvtColor(background_image, cv.COLOR_BGR2GRAY)

# Iterate through each file in the folder
for filename in os.listdir(image_folder):
    if filename.endswith('.jpeg') and filename != 'Main.jpeg':
        image_path = os.path.join(image_folder, filename)
        
        # Load the current image
        image = cv.imread(image_path, cv.IMREAD_COLOR)
        if image is None:
            print(f"Unable to load image: {image_path}")
            continue

        # Convert current image to grayscale
        image_gray = cv.cvtColor(image, cv.COLOR_BGR2GRAY)
        
        # Perform background subtraction
        fgMask = cv.absdiff(background_gray, image_gray)
        
        # Apply a threshold to get the binary image
        _, fgMask = cv.threshold(fgMask, 50, 255, cv.THRESH_BINARY)
    
        # Display the original image and the foreground mask
        display_image(cv.cvtColor(image, cv.COLOR_BGR2RGB), title='Original Image')
        display_image(fgMask, title='Foreground Mask')

        # Plot Feret diameters on the preprocessed image
        feret.plot(fgMask)        
        # Calculate Feret diameters and angles
        maxf_length, minf_length, minf_angle, maxf_angle = feret.all(fgMask)
        
        # Print the results for the current image
        print(f"Filename: {filename}, Max Feret Length: {maxf_length}, Min Feret Length: {minf_length}")
        
    
        # result_path = os.path.join('path_to_save_results', f'fgMask_{filename}')
        # cv.imwrite(result_path, fgMask)

        # Pause to control the display in the notebook
        # This can be adjusted or removed as needed
        input("Press Enter to continue...")

# Clean up
plt.close('all')

# Optionally, print all results at the end
# for res in results:
#     print(f"Filename: {res['filename']}, Max Feret Length: {res['maxf_length']}, Min Feret Length: {res['minf_length']}")

[Images from the original post: Main.jpeg (the background), several sample item photos, and the current output masks]

Are there alternative methods or preprocessing steps in OpenCV that could help in better isolating the items from the background? Any advice, suggestions, or guidance on how to tackle these issues would be greatly appreciated.


Solution

  • The first change I would make is taking the maximum difference for each pixel across the channels, rather than converting to grayscale. When converting to grayscale, different colors with a similar intensity map to nearly the same gray value, so the difference between them is lost.
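
    In OpenCV/NumPy terms, using the variable names from the question's code (image and background_image), this could look like the following minimal sketch:

    diff_bgr = cv.absdiff(image, background_image)  # per-channel absolute difference
    fgMask_gray = diff_bgr.max(axis=2)              # maximum difference across B, G, R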

    Next, lower the threshold. You picked 50 to ensure that no pixels at all are found outside the object. With a bit of filtering you can reduce the noise level, and thus also lower this threshold. Ideally, the threshold is chosen such that the object is as solid as possible, i.e. most object pixels are identified. You can then ignore the other "objects" that appear because of noise and differences in the background intensity by looking only at the object in the middle of the field of view.

    This is one way of doing so. I'm using DIPlib here, as I know it better than OpenCV, so it's faster for me to experiment (also, I'm an author).

    import diplib as dip
    
    bg = dip.ImageRead('7AqYQsRe.jpg')  # the background image
    img = dip.ImageRead('3Kvw57Jl.jpg')
    
    diff = dip.MaximumTensorElement(dip.Abs(img - bg))  # your fgMask_gray
    diff = dip.MedianFilter(diff)  # a minimal amount of filtering
    objects = diff > 10  # a random value that seems to do quite well
    
    # Pick the object in the center of the field of view, the quick and dirty way:
    objects = dip.Label(objects)
    label = objects[objects.Size(0) // 2, objects.Size(1) // 2]
    mask = objects == label
    
    # Measure
    msr = dip.MeasurementTool.Measure(+mask, features=['Feret'])
    print(msr)
    max_len, min_len, perp_min_len, max_angle, min_angle = msr[1]['Feret']
    

    Note that dip.MedianFilter uses a circular mask with a diameter of 7 pixels by default. I have not tried to adjust this parameter. You could also try better noise filters. All filtering can affect the location of edges; heavier filtering will likely make your object smaller. But some filters are better at preserving the edge location than others. Linear filters will have the biggest impact. Be aware of this!
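
    For example, to experiment with the filter strength (the sizes below are illustrative assumptions, not tested values, assuming the mask size can be passed as a plain number):

    diff_light = dip.MedianFilter(diff, 3)   # smaller mask: weaker smoothing, edges better preserved
    diff_heavy = dip.MedianFilter(diff, 15)  # larger mask: stronger smoothing, may shrink the object
    diff_gauss = dip.Gauss(diff, [2])        # a linear filter: strongest effect on edge location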

    The thresholded image actually looks really good. We could use dip.AreaOpening() to remove the smaller dots, leaving only the true object, but I included the center-picking logic in the code above because it makes sense and is likely more robust. An even better way to keep only the object in the center is to measure the centroid of each object and pick the one whose centroid is closest to the center of the image:

    import numpy as np
    
    msr = dip.MeasurementTool.Measure(objects, features=['Center', 'Feret'])
    best_dist = 1e7
    best_obj = 0
    center = np.asarray(objects.Sizes()) // 2
    for obj in msr.Objects():
        dist = np.linalg.norm(np.asarray(msr[obj]['Center']) - center)
        if dist < best_dist:
            best_dist = dist
            best_obj = obj
    
    max_len, min_len, perp_min_len, max_angle, min_angle = msr[best_obj]['Feret']
    

    Edit 1: If the object is not always in the center of the image, you could, for example, look for the biggest object (using DIPlib, that's the 'Size' feature), or come up with some other feature that is meaningful in your application.
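
    A sketch of that, following the same pattern as the centroid loop above (untested; it assumes the 'Size' feature yields the object's area as a one-element list):

    msr = dip.MeasurementTool.Measure(objects, features=['Size', 'Feret'])
    best_obj = max(msr.Objects(), key=lambda obj: msr[obj]['Size'][0])
    max_len, min_len, perp_min_len, max_angle, min_angle = msr[best_obj]['Feret']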


    Edit 2: To measure only the largest object:

    import diplib as dip
    
    bg = dip.ImageRead('7AqYQsRe.jpg')  # the background image
    img = dip.ImageRead('3Kvw57Jl.jpg')
    
    diff = dip.MaximumTensorElement(dip.Abs(img - bg))  # your fgMask_gray
    diff = dip.MedianFilter(diff)  # a minimal amount of filtering
    
    objects = diff > 10  # a random value that seems to do quite well
    objects = dip.Label(objects, mode='largest')  # the labeled image will have only one object in it
    msr = dip.MeasurementTool.Measure(objects, features=['Feret'])
    max_len, min_len, perp_min_len, max_angle, min_angle = msr[1]['Feret']
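
    For reference, here is a rough OpenCV/NumPy translation of the same pipeline, feeding the mask back into the feret library from the question. This is a sketch under assumptions: the sample filename is a placeholder, OpenCV's medianBlur uses a square window rather than DIPlib's circular one, and the unpacking order of feret.all() mirrors the question's code.

    import cv2 as cv
    import numpy as np
    import feret

    bg = cv.imread('Main.jpeg', cv.IMREAD_COLOR)     # the background image
    img = cv.imread('sample.jpeg', cv.IMREAD_COLOR)  # placeholder filename

    # Per-channel absolute difference, then the maximum across B, G, R
    diff = cv.absdiff(img, bg).max(axis=2)

    # A minimal amount of filtering (7x7 square window)
    diff = cv.medianBlur(diff, 7)

    # A lower threshold, now that the noise is reduced
    _, mask = cv.threshold(diff, 10, 255, cv.THRESH_BINARY)

    # Keep only the largest connected component (label 0 is the background)
    n, labels, stats, _ = cv.connectedComponentsWithStats(mask)
    if n > 1:
        largest = 1 + int(np.argmax(stats[1:, cv.CC_STAT_AREA]))
        mask = np.uint8(labels == largest) * 255

    maxf_length, minf_length, minf_angle, maxf_angle = feret.all(mask)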