Given a sprite sheet like this:
I would like to write an algorithm that can loop through the pixel data and determine the bounding rectangle of each discrete sprite.
Assume that for each pixel (x, y) I can read either true (the pixel is not totally transparent) or false (the pixel is totally transparent). How would I go about automatically generating the bounding rectangle for each sprite?
The resulting data should be an array of rectangle objects with {x, y, width, height}.
Here's the same image but with the bounds of the first four sprites marked in light blue:
Can anyone give a step-by-step on how to detect these bounds as described above?
Here's an approach:
After converting to grayscale, we apply Otsu's threshold to obtain a binary image.
Next, we perform morphological transformations (a close followed by a dilate) to merge each sprite into a single contour.
From here, we find contours, iterate through each contour, draw its bounding rectangle, and extract each region of interest (ROI). Here's the result:
And here's each saved sprite ROI:
I've implemented this method using OpenCV and Python, but you can adapt the strategy to any language:
import cv2

# Load the image, convert to grayscale, and apply Otsu's threshold
image = cv2.imread('1.jpg')
original = image.copy()
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
thresh = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)[1]

# Close and dilate so the parts of each sprite merge into one contour
kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (3,3))
close = cv2.morphologyEx(thresh, cv2.MORPH_CLOSE, kernel, iterations=2)
dilate = cv2.dilate(close, kernel, iterations=1)

# Find external contours (the return value differs between OpenCV versions)
cnts = cv2.findContours(dilate, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
cnts = cnts[0] if len(cnts) == 2 else cnts[1]

# Save each sprite and draw its bounding rectangle
sprite_number = 0
for c in cnts:
    x,y,w,h = cv2.boundingRect(c)
    # Crop from the untouched copy so drawn rectangles never leak into a saved sprite
    ROI = original[y:y+h, x:x+w]
    cv2.imwrite('sprite_{}.png'.format(sprite_number), ROI)
    cv2.rectangle(image, (x, y), (x + w, y + h), (36,255,12), 2)
    sprite_number += 1

cv2.imshow('thresh', thresh)
cv2.imshow('dilate', dilate)
cv2.imshow('image', image)
cv2.waitKey()
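If OpenCV isn't available, the same idea can be sketched with only the per-pixel transparency test described in the question. This is not the OpenCV method above but a plain flood fill (connected-component labeling) standing in for the contour step. The names `find_sprite_bounds` and `is_opaque` are hypothetical, and the sketch assumes `is_opaque(x, y)` returns true for any pixel that is not totally transparent:

```python
from collections import deque

def find_sprite_bounds(is_opaque, width, height):
    """Return a list of {x, y, width, height} dicts, one per connected blob.

    is_opaque(x, y) is an assumed callback: True if the pixel is not
    totally transparent, False otherwise.
    """
    visited = [[False] * width for _ in range(height)]
    bounds = []
    for start_y in range(height):
        for start_x in range(width):
            if visited[start_y][start_x] or not is_opaque(start_x, start_y):
                continue
            # Breadth-first flood fill over this blob, tracking its extremes.
            min_x = max_x = start_x
            min_y = max_y = start_y
            visited[start_y][start_x] = True
            queue = deque([(start_x, start_y)])
            while queue:
                x, y = queue.popleft()
                min_x, max_x = min(min_x, x), max(max_x, x)
                min_y, max_y = min(min_y, y), max(max_y, y)
                # 8-connectivity so diagonally touching pixels stay together.
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        nx, ny = x + dx, y + dy
                        if (0 <= nx < width and 0 <= ny < height
                                and not visited[ny][nx] and is_opaque(nx, ny)):
                            visited[ny][nx] = True
                            queue.append((nx, ny))
            bounds.append({"x": min_x, "y": min_y,
                           "width": max_x - min_x + 1,
                           "height": max_y - min_y + 1})
    return bounds

# Tiny stand-in for a sprite sheet: '#' is opaque, '.' is transparent.
demo_grid = [
    "##..#",
    "##...",
    "....#",
]
demo_bounds = find_sprite_bounds(
    lambda x, y: demo_grid[y][x] == "#", len(demo_grid[0]), len(demo_grid))
# demo_bounds holds three rectangles: the 2x2 block and two single pixels.
```

Unlike the close and dilate steps above, this sketch does not merge a sprite whose parts are separated by transparent pixels. For such sprites, one option would be to merge rectangles that overlap or lie within a few pixels of each other afterwards.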