I'm trying to detect the panels of xkcd comics so that I can cut them out. I want to retrieve random XKCD comics (implemented) and reliably know where the corners of panels are. My ideal output would be something like this:
Where I have some or all of the corners of panels. I have an array containing the entire comic, and my approach so far has been to find the large contours, like so:
contours, hierarchy = cv2.findContours(img, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_NONE)
conts = []
if len(contours) > 1:
    conts = sorted(contours, key=cv2.contourArea, reverse=True)[0]
I then detect corners using cv2.approxPolyDP. However, my contouring doesn't seem to be sufficient: I usually don't get clean corners, and when I draw the contours, they don't look that good.
A good run:
A bad one:
I'm wondering if I'm doing something wrong with my contours, like maybe not choosing the right mode (I chose RETR_EXTERNAL because the comic panels are fairly external).
The comics are normally grayscale but sometimes not, so I'm converting to grayscale and using a binary threshold prior to contour detection.
Some other thoughts:
Would edge detection maybe be more suitable for this task?
I've also noticed that the array I have has the border right at the edge of the picture, so would padding help?
The comics are somewhat variable so would a brute force method maybe work better?
My hunch is the contour area will tell you something about how complex the path can be.
I'd go with a simpler metric, such as the height of the contour's boundingRect():
#!/usr/bin/env python
import cv2
import numpy as np
filename = 'boyfriend.png'
img = cv2.imread(filename)
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
ret, dst = cv2.threshold(gray, 127, 255, cv2.THRESH_BINARY)
# erode and negate to amplify edges
dst = cv2.erode(dst,None,iterations=2)
dst = (255-dst)
cv2.imshow('thresh', dst)
contours, hierarchy = cv2.findContours(dst, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
if len(contours) > 1:
    # print bounding box height per contour for debugging purposes
    print([cv2.boundingRect(cnt)[3] for cnt in contours])
    # filter contours where the bounding box height matches the image height
    conts = [cnt for cnt in contours if cv2.boundingRect(cnt)[3] == img.shape[0]]
    # preview
    img = cv2.drawContours(img, conts, -1, (0, 255, 0), 3)
cv2.imshow('img',img)
cv2.waitKey(0)
Notice 3 contour bounding box heights stand out:
[127, 16, 16, 16, 16, 35, 15, 36, 16, 16, 16, 18, 17, 17, 220, 220, 220]
You may want to relax the condition to a threshold instead of an exact match, e.g.:
# if the contour bounding box height is > 3/4 of the image height
cv2.boundingRect(cnt)[3] > img.shape[0] * 0.75
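To get from the filtered contours to panel corners, here is a minimal sketch of the threshold filter applied to plain (x, y, w, h) tuples, as returned by cv2.boundingRect(). The helper names select_panel_rects and rect_corners are hypothetical. The x, y and width values are made up for illustration, and the image height of 220 is an assumption inferred from the printed heights above. Only the heights come from the actual run.

```python
def select_panel_rects(rects, image_height, min_fraction=0.75):
    """Keep (x, y, w, h) rects taller than min_fraction of the image height,
    sorted left to right."""
    min_height = image_height * min_fraction
    tall = [r for r in rects if r[3] > min_height]
    return sorted(tall, key=lambda r: r[0])


def rect_corners(rect):
    """Return the corners of an (x, y, w, h) rect, clockwise from top-left.
    x + w and y + h are the exclusive right/bottom edges."""
    x, y, w, h = rect
    return [(x, y), (x + w, y), (x + w, y + h), (x, y + h)]


# In the script above the rects would come from:
#   rects = [cv2.boundingRect(cnt) for cnt in contours]
# Heights are the ones printed above; x, y and w are placeholder values.
heights = [127, 16, 16, 16, 16, 35, 15, 36, 16, 16, 16, 18, 17, 17, 220, 220, 220]
rects = [(10 * i, 0, 8, h) for i, h in enumerate(heights)]

panels = select_panel_rects(rects, image_height=220)  # 220 is an assumed height
for rect in panels:
    print(rect, rect_corners(rect))
```

This only gives axis-aligned corners. If the panels are skewed or hand-drawn enough for that to matter, running cv2.approxPolyDP on the filtered contours (as in the question) is probably the better route.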
Notice that I'm using a morphological filter to amplify the edges. It works for that image with that number of iterations because the filter expands the box outlines enough, but not so much that they merge with the text or characters. This may be something to tweak for other images.
Update: Doing a quick search, I found a couple of potentially interesting resources:
Kumiko, the Comics Cutter is a set of tools to compute useful information about comic book pages, panels, and more. Its main strength is to find out the locations of panels within a comic's page (image file). Kumiko can also compile information about panels for all pages in a comic book, and present it as one piece of data (JSON-formatted object).
Segmentation and indexation of complex objects in comic book images (different from what you're asking, however potentially useful for next steps)