Tags: python, opencv, image-processing, corner-detection

Detect corners of XKCD comic panels


I'm trying to detect the panels of XKCD comics so that I can cut them out. I want to retrieve random XKCD comics (already implemented) and reliably know where the corners of the panels are. My ideal output would be an image of the comic with some or all of the panel corners marked.

I have an array containing the entire comic, and my approach so far has been to sort the contours by area and keep the largest, like so:

    contours, hierarchy = cv2.findContours(img, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_NONE)
    conts=[]
    if len(contours) > 1:
        conts = sorted(contours, key=cv2.contourArea, reverse=True)[0]

I then detect corners using cv2.approxPolyDP. My contouring seems to be insufficient, however: I usually don't get clean corners, and the contours don't look good when I draw them.

A good run:

Good run (some dots fall onto corners)

A bad one:

Bad run (girl is covered in dots)

I'm wondering if I'm doing something wrong with my contours, like maybe not choosing the right mode (I chose RETR_EXTERNAL because the comic panels are fairly external).

The comics are normally grayscale but sometimes not, so I'm converting to grayscale and using a binary threshold prior to contour detection.

Some other thoughts:

  1. Would edge detection maybe be more suitable for this task?

  2. I've also noticed that the array I have has the border right at the edge of the picture, so would padding help?

  3. The comics are somewhat variable so would a brute force method maybe work better?


Solution

  • My hunch is that the contour area will tell you something about how complex the path can be. I'd go with a simpler metric, such as the height of the contour's boundingRect():

    #!/usr/bin/env python
    import cv2
    import numpy as np
    
    filename = 'boyfriend.png'
    img = cv2.imread(filename)
    gray = cv2.cvtColor(img,cv2.COLOR_BGR2GRAY)
    
    ret, dst = cv2.threshold(gray, 127, 255, cv2.THRESH_BINARY)
    # erode and negate to amplify edges
    dst = cv2.erode(dst,None,iterations=2)
    dst = (255-dst)
    
    cv2.imshow('thresh', dst)
    
    contours, hierarchy = cv2.findContours(dst, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
    
    if len(contours) > 1:
        # print bounding box height per contour for debugging purposes
        print([cv2.boundingRect(cnt)[3] for cnt in contours])
        # filter contours where the bounding box height matches the image height
        conts = [cnt for cnt in contours if cv2.boundingRect(cnt)[3] == img.shape[0]]
        # preview
        img = cv2.drawContours(img, conts, -1, (0,255,0), 3)
    
    cv2.imshow('img',img)
    cv2.waitKey(0)
    

    Notice that three contour bounding box heights stand out (the three 220 values at the end):

    [127, 16, 16, 16, 16, 35, 15, 36, 16, 16, 16, 18, 17, 17, 220, 220, 220]
    

    You may want to change the condition to a threshold instead of an exact match, e.g.:

    # if the contour bounding box height is > 3/4 of the image height
    cv2.boundingRect(cnt)[3] > img.shape[0] * 0.75
    

    The full script above produces: (image: xkcd panel detection)
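
    Since the question asks for corner coordinates, the filtered bounding boxes can be turned into corners directly. This is only a minimal sketch, not part of the script above: panel_corners is a hypothetical helper, the rectangles are made-up values in the (x, y, w, h) layout that cv2.boundingRect() returns, and min_ratio=0.75 is the same threshold suggested earlier.

```python
def panel_corners(rects, img_height, min_ratio=0.75):
    """Return the four (x, y) corners of every rect tall enough to be a panel.

    rects: iterable of (x, y, w, h) tuples, as returned by cv2.boundingRect().
    """
    corners = []
    for x, y, w, h in rects:
        if h > img_height * min_ratio:
            # w and h count pixels, so the last covered pixel is (x + w - 1, y + h - 1)
            right, bottom = x + w - 1, y + h - 1
            # order: top-left, top-right, bottom-right, bottom-left
            corners.append([(x, y), (right, y), (right, bottom), (x, bottom)])
    return corners


# made-up bounding boxes: two full-height panels and one small text blob
rects = [(0, 0, 134, 220), (143, 98, 15, 15), (167, 0, 133, 220)]
for quad in panel_corners(rects, img_height=220):
    print(quad)
```

    This assumes the panels are axis-aligned rectangles, which is why the bounding box is enough; for skewed panels, cv2.approxPolyDP on the filtered contours would be the closer fit.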

    Notice that I'm using a morphological filter to amplify the edges. It works for this image with this number of iterations because the filter expands the box outlines enough, but not so much that they merge with the text or characters. This may be something to tweak for other images.

    Update: After a quick search, I found a couple of potentially interesting resources:

    Kumiko

    Kumiko, the Comics Cutter is a set of tools to compute useful information about comic book pages, panels, and more. Its main strength is to find out the locations of panels within a comic's page (image file). Kumiko can also compile information about panels for all pages in a comic book, and present it as one piece of data (JSON-formatted object).

    (image: kumiko xkcd panel detection)

    Segmentation and indexation of complex objects in comic book images (different from what you're asking, however potentially useful for next steps)