python opencv image-processing image-thresholding datamatrix

Use data matrix as a fiducial to obtain angle of rotation


I have a bunch of images such as the one above. Each contains a data matrix, but the matrix is not guaranteed to be aligned with the image axes. I can still read these matrices with libdmtx pretty reliably regardless of their rotation. However, I also need to rotate the image so that the label is right-side-up. My plan is to get the angle of rotation of the data matrix and then rotate the image with PIL to orient it correctly. pylibdmtx.decode returns the data that the matrix contains, as well as a rectangle, which I originally thought was the bounding box of the data matrix. To test this, I ran the following code with the image above:

from PIL import Image
from pylibdmtx.pylibdmtx import decode

def segment_qr_code(image: Image.Image):
    data = decode(image)[0]
    print(data.rect)

if __name__ == "__main__":
    segment_qr_code(Image.open('<path to image>'))

Unfortunately, this code returned Rect(left=208, top=112, width=94, height=-9). Because the height is negative, I don't think it is the bounding box of the data matrix, and if it is, I don't know how to use it to get the angle of rotation.

My question is: what is the best way to obtain the angle of rotation of the data matrix? I originally thought that I could crop the image with the bounding box to get a segmented image of just the data matrix, then use image thresholding or contouring to get an angle of rotation. However, I'm not sure how to get the correct bounding box, and even if I had it, I don't know how to use thresholding. I would also prefer not to use thresholding because it isn't always accurate.

The data matrix always has a solid border on the bottom and left sides, so I think it may be possible to use it as a fiducial to align the image. However, I was unable to find any library that returns the angle of rotation of the data matrix.

I am open to any suggestions. Thanks in advance.


Solution

  • Thank you to @flakes for the suggestion. Combining code from the PR and issue, I created the following solution:

    from pylibdmtx.pylibdmtx import _region, _decoder, _image, _pixel_data, _decoded_matrix_region
    from pylibdmtx.wrapper import c_ubyte_p, DmtxPackOrder, DmtxVector2, dmtxMatrix3VMultiplyBy, DmtxUndefined
    from ctypes import cast, string_at
    from collections import namedtuple
    
    # Map bits per pixel to the matching libdmtx pack order
    _pack_order = {
        8: DmtxPackOrder.DmtxPack8bppK,
        16: DmtxPackOrder.DmtxPack16bppRGB,
        24: DmtxPackOrder.DmtxPack24bppRGB,
        32: DmtxPackOrder.DmtxPack32bppRGBX,
    }
    Decoded = namedtuple('Decoded', 'data rect')
    
    
    def decode_with_region(image):
        """Decode every data matrix in `image`.

        Each result's rect is (topLeft, topRight, bottomRight, bottomLeft),
        with every corner given as an (x, y) tuple.
        """
        results = []
        pixels, width, height, bpp = _pixel_data(image)
        with _image(cast(pixels, c_ubyte_p), width, height, _pack_order[bpp]) as img:
            with _decoder(img, 1) as decoder:
                while True:
                    with _region(decoder, None) as region:
                        if not region:
                            break
                        res = _decode_region(decoder, region)
                        if res:
                            # Flip y (height - y) to convert each corner to
                            # top-left-origin image coordinates. The height from
                            # _pixel_data is used directly, so grayscale images
                            # work as well as RGB ones.
                            topLeft = (res.rect['01']['x'], height - res.rect['01']['y'])
                            topRight = (res.rect['11']['x'], height - res.rect['11']['y'])
                            bottomRight = (res.rect['10']['x'], height - res.rect['10']['y'])
                            bottomLeft = (res.rect['00']['x'], height - res.rect['00']['y'])
                            results.append(Decoded(res.data, (topLeft, topRight, bottomRight, bottomLeft)))
        return results
    
    
    def _decode_region(decoder, region):
        with _decoded_matrix_region(decoder, region, DmtxUndefined) as msg:
            if msg:
                vector00 = DmtxVector2()
                vector11 = DmtxVector2(1.0, 1.0)
                vector10 = DmtxVector2(1.0, 0.0)
                vector01 = DmtxVector2(0.0, 1.0)
                dmtxMatrix3VMultiplyBy(vector00, region.contents.fit2raw)
                dmtxMatrix3VMultiplyBy(vector11, region.contents.fit2raw)
                dmtxMatrix3VMultiplyBy(vector01, region.contents.fit2raw)
                dmtxMatrix3VMultiplyBy(vector10, region.contents.fit2raw)
    
                return Decoded(
                    string_at(msg.contents.output),
                    {
                        '00': {
                            'x': int((vector00.X) + 0.5),
                            'y': int((vector00.Y) + 0.5)
                        },
                        '01': {
                            'x': int((vector01.X) + 0.5),
                            'y': int((vector01.Y) + 0.5)
                        },
                        '10': {
                            'x': int((vector10.X) + 0.5),
                            'y': int((vector10.Y) + 0.5)
                        },
                        '11': {
                            'x': int((vector11.X) + 0.5),
                            'y': int((vector11.Y) + 0.5)
                        }
                    }
                )
            else:
                return None
    

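    To decode an image, use decode_with_region() instead of pylibdmtx's decode(). It returns a list of Decoded tuples, each holding the decoded data and a rect made of four corner points (top-left, top-right, bottom-right, bottom-left), which I can plot on an image and get the following output: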
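
A minimal sketch of how the corners could be drawn with Pillow's ImageDraw. The helper name draw_corners and the corner values are made up for illustration and stand in for a real result from decode_with_region():

```python
from PIL import Image, ImageDraw


def draw_corners(image, corners, color=(255, 0, 0), width=3):
    """Return a copy of `image` with the quadrilateral `corners` outlined.

    `corners` is a sequence of four (x, y) points in image coordinates,
    ordered (topLeft, topRight, bottomRight, bottomLeft).
    """
    annotated = image.convert("RGB")  # convert() returns a new image
    draw = ImageDraw.Draw(annotated)
    points = [tuple(p) for p in corners]
    # Repeat the first point at the end to close the outline.
    draw.line(points + [points[0]], fill=color, width=width)
    return annotated


# Hypothetical corners standing in for decode_with_region(image)[0].rect
example_corners = ((60, 40), (140, 60), (120, 140), (40, 120))
blank = Image.new("RGB", (200, 200), "white")
annotated = draw_corners(blank, example_corners)
```

With a real image, something like draw_corners(image, decode_with_region(image)[0].rect).show() should make it easy to see whether the four points land on the matrix corners.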

    I can then use these coordinates to obtain an angle of rotation:

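    import math

    def get_data_from_matrix(image):
        decoded = decode_with_region(image)[0]
        # rect is (topLeft, topRight, bottomRight, bottomLeft); indices 2 and 3
        # are the two ends of the solid bottom edge
        bottomRight, bottomLeft = decoded.rect[2], decoded.rect[3]
        rotation = -math.atan2(bottomRight[1] - bottomLeft[1], bottomRight[0] - bottomLeft[0]) * (180 / math.pi)
        image = image.rotate(rotation, expand=True)
        return decoded.data, image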
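
To sanity-check the range and sign of the angle before running on real images, the same formula can be applied to hand-picked corners. This is only a sketch: the function name rotation_from_bottom_edge and the coordinates are made up for illustration, and whether the sign matches the direction that Image.rotate needs is best confirmed on a real image.

```python
import math


def rotation_from_bottom_edge(bottom_left, bottom_right):
    """Angle in degrees, using the same formula as get_data_from_matrix.

    Both arguments are (x, y) points in image coordinates (y grows downwards).
    """
    dy = bottom_right[1] - bottom_left[1]
    dx = bottom_right[0] - bottom_left[0]
    return -math.degrees(math.atan2(dy, dx))


# Hypothetical corner coordinates, not taken from a real decode.
level = rotation_from_bottom_edge((10, 50), (60, 50))        # edge runs left to right
vertical = rotation_from_bottom_edge((10, 10), (10, 60))     # edge runs straight down
upside_down = rotation_from_bottom_edge((60, 50), (10, 50))  # edge runs right to left

print(level, vertical, upside_down)
```

A level bottom edge gives an angle of zero, an edge pointing straight down gives -90, and a reversed edge gives -180, so the formula covers the full circle rather than only small tilts.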