Tags: python, image-processing, computer-vision, image-cropper

Cropping an image by removing the white space


I am trying to identify the empty (white) regions of an image that contain no content, and then crop the image to eliminate them, just like in the images below.

Image with spaces --> Image after cropping

I would be grateful for your help. Thanks in advance!

I was using the following code, but it was not really working.

import cv2
import numpy as np

def crop_empty_spaces_refined(image_path, threshold_percentage=0.01):
    image = cv2.imread(image_path, cv2.IMREAD_UNCHANGED)

    if image is None:
        print(f"Error: Could not read image at {image_path}")
        return None

    if image.shape[2] == 4:  # RGBA image
        gray = image[:, :, 3]  # Use alpha channel
    else:
        gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)

    _, thresh = cv2.threshold(gray, 240, 255, cv2.THRESH_BINARY_INV)

    kernel = np.ones((3, 3), np.uint8)
    thresh = cv2.morphologyEx(thresh, cv2.MORPH_OPEN, kernel, iterations=1)

    contours, _ = cv2.findContours(thresh, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)

    if contours:
        image_area = image.shape[0] * image.shape[1]
        min_contour_area = image_area * threshold_percentage

        valid_contours = [cnt for cnt in contours if cv2.contourArea(cnt) >= min_contour_area]

        if valid_contours:
            # ***Corrected Bounding Box Calculation***
            x_coords = []
            y_coords = []

            for cnt in valid_contours:
                x, y, w, h = cv2.boundingRect(cnt)
                x_coords.extend([x, x + w])  # Add both start and end x
                y_coords.extend([y, y + h])  # Add both start and end y

            x_min = min(x_coords)
            y_min = min(y_coords)
            x_max = max(x_coords)
            y_max = max(y_coords)

            cropped_image = image[y_min:y_max, x_min:x_max]

            return cropped_image
        else:
            print("No valid contours found after filtering. Returning original image.")
            return image
    else:
        return image

image_path = '/mnt/data/Untitled.png'  # file path
cropped_image = crop_empty_spaces_refined(image_path, threshold_percentage=0.0001)

if cropped_image is not None:
    cv2.imwrite('/mnt/data/cropped_output.png', cropped_image)
    print("Image Cropped and saved")
else:
    print("Could not crop image")


Solution

  • Approach:

    1. threshold -> obtain mask
    2. use boundingRect() on the mask
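    3. crop

    import cv2 as cv
    import numpy as np

    im = cv.imread("Cb768fyr.jpg")
    gray = cv.cvtColor(im, cv.COLOR_BGR2GRAY)
    
    th = 240 # 255 won't do, the image's background isn't perfectly white
    (th, mask) = cv.threshold(gray, th, 255, cv.THRESH_BINARY_INV)
    
    (x, y, w, h) = cv.boundingRect(mask) # box around all non-zero mask pixels
    
    pad = 0 # increase to give it some border
    # clamp the start indices at 0: a negative start would wrap around to the far edge
    cropped = im[max(y-pad, 0):y+h+pad, max(x-pad, 0):x+w+pad]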
    

    cropped

    Why threshold at all? Because cv.boundingRect() would otherwise treat all non-zero pixels as "true", i.e. the background would be considered foreground.

    Why threshold with something other than 255? The background isn't perfectly white, because the source image was compressed lossily. If you thresholded at 255, this would be the result:

    bad mask, level 255


    If you wanted to replace cv.boundingRect(), you could do it like this:

    1. max-reduce mask along each axis in turn
    2. find first and last index of non-zero values

    xproj = np.max(mask, axis=1) # collapse X, have Y
    ymin = np.argmax(xproj)
    ymax = len(xproj) - np.argmax(xproj[::-1])
    print(f"{ymin=}, {ymax=}")
    
    yproj = np.max(mask, axis=0)
    xmin = np.argmax(yproj)
    xmax = len(yproj) - np.argmax(yproj[::-1])
    print(f"{xmin=}, {xmax=}")
    
    # clamp the start indices at 0: a negative start would wrap around to the far edge
    cropped = im[max(ymin-pad, 0):ymax+pad, max(xmin-pad, 0):xmax+pad]
    

    This could also use np.argwhere(). I won't bother comparing these two approaches since cv.boundingRect() does the job already.
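For completeness, a minimal sketch of the np.argwhere() variant. The mask here is a made-up stand-in for the output of cv.threshold(), and the code assumes the mask has at least one non-zero pixel (min and max fail on an empty array).

```python
import numpy as np

# Hypothetical mask: non-zero where there is content.
mask = np.zeros((100, 200), dtype=np.uint8)
mask[30:60, 50:120] = 255

coords = np.argwhere(mask)           # (N, 2) array of (row, col) for non-zero pixels
ymin, xmin = coords.min(axis=0)
ymax, xmax = coords.max(axis=0) + 1  # +1 so the values work as exclusive slice ends

print(ymin, ymax, xmin, xmax)  # 30 60 50 120
```

The four values slice the image the same way as the projection version: im[ymin:ymax, xmin:xmax].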


    The findContours approach will pick any connected component, not all of them. This means it could sometimes pick the triad (bottom left) or text (top left), and entirely discard most of the image.

    You could fix that by slapping a convex hull on all the contours, but you'd still have to call boundingRect() anyway. So, all the contour stuff is wasted effort.