
How does perspective transformation work in PIL?


PIL's Image.transform has a perspective mode that requires an 8-tuple of data, but I can't figure out how to convert, say, a right tilt of 30 degrees into that tuple.

Can anyone explain it?


Solution

  • To apply a perspective transformation, you first need four points in a plane A that map to four points in a plane B. From those four correspondences you can derive the homography between the two planes, which gives you the 8 coefficients the transformation needs.

    The site http://xenia.media.mit.edu/~cwren/interpolator/ (mirror: WebArchive), like many other texts, describes how those coefficients can be determined. To make things easy, here is a direct implementation based on that link:

    import numpy
    
    def find_coeffs(pa, pb):
        # Each point pair contributes two rows of the 8x8 linear system.
        matrix = []
        for p1, p2 in zip(pa, pb):
            matrix.append([p1[0], p1[1], 1, 0, 0, 0, -p2[0]*p1[0], -p2[0]*p1[1]])
            matrix.append([0, 0, 0, p1[0], p1[1], 1, -p2[1]*p1[0], -p2[1]*p1[1]])
    
        # numpy.float was removed in NumPy 1.24; the builtin float works everywhere.
        A = numpy.array(matrix, dtype=float)
        B = numpy.array(pb, dtype=float).reshape(8)
    
        # With exactly four point pairs A is square, so solve A @ coeffs = B directly.
        return numpy.linalg.solve(A, B)
    

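    where pb holds the four vertices in the current (source) plane and pa holds the four corresponding vertices in the resulting plane. The order can look reversed at first: Image.transform works backwards, mapping each pixel of the output image to a position in the input image, so the coefficients must take points of pa to points of pb.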
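
As a brief sketch of where the two matrix rows per point pair come from: Pillow's documentation describes PERSPECTIVE as sampling, for each output pixel (x, y), the input image at the position given in the first line below. The symbols (u, v) for a source point (an entry of pb) and (x, y) for the matching output point (an entry of pa) are notation introduced here for illustration. Multiplying out the denominator gives two equations that are linear in the eight coefficients:

```latex
\begin{aligned}
u = \frac{a x + b y + c}{g x + h y + 1}, \qquad
v &= \frac{d x + e y + f}{g x + h y + 1} \\[4pt]
a x + b y + c - g\,x u - h\,y u &= u \\
d x + e y + f - g\,x v - h\,y v &= v
\end{aligned}
```

Four point pairs give eight such equations for the eight unknowns (a, b, c, d, e, f, g, h), which is the system that find_coeffs builds and solves.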

    So, suppose we first shear an image with an affine transform, as in:

    import sys
    from PIL import Image
    
    img = Image.open(sys.argv[1])
    width, height = img.size
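    m = -0.5  # shear factor: the horizontal offset grows linearly with y
    xshift = abs(m) * height  # offset of the bottom row, so it scales with the height
    new_width = width + int(round(xshift))
    img = img.transform((new_width, height), Image.AFFINE,
            (1, m, -xshift if m > 0 else 0, 0, 1, 0), Image.BICUBIC)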
    img.save(sys.argv[2])
    

    Here is a sample input and output with the code above:

    [Images: sample input and the sheared output]
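
A quick way to see why the sheared image has to be wider is to track where the four corners of the original end up. The sketch below is plain Python and does not call Pillow. `sheared_corners` is a hypothetical helper name, and it assumes the same coefficient layout (1, m, c, 0, 1, 0) as the call above, with the offset taken as abs(m) * height because the horizontal displacement grows with y.

```python
def sheared_corners(width, height, m):
    """Return the positions of the original image's corners after the shear.

    Assumes affine coefficients (1, m, c, 0, 1, 0), where Pillow samples the
    input at x_in = x_out + m * y + c for every output pixel (x_out, y).
    """
    xshift = abs(m) * height          # assumption: offset of the bottom row
    c = -xshift if m > 0 else 0       # same offset rule as in the transform call
    corners = [(0, 0), (width, 0), (width, height), (0, height)]
    # Invert the sampling rule: x_out = x_in - m * y - c.
    return [(x - m * y - c, y) for x, y in corners]


if __name__ == "__main__":
    for corner in sheared_corners(256, 256, -0.5):
        print(corner)
```

For a 256x256 image and m = -0.5, the top corners stay where they are while the bottom corners move to x = 384 and x = 128, so the output needs 384 columns.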

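    Continuing from the code above, we can apply a perspective transformation to revert the shear. The first list holds the corners of the output image, the second the positions of those same corners in the sheared image: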

    coeffs = find_coeffs(
            [(0, 0), (width, 0), (width, height), (0, height)],
            [(0, 0), (width, 0), (new_width, height), (xshift, height)])
    
    img.transform((width, height), Image.PERSPECTIVE, coeffs,
            Image.BICUBIC).save(sys.argv[3])
    

    Resulting in:

    [Image: output of the perspective transform, with the shear reverted]
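
One way to sanity-check a set of coefficients without opening an image is to push the destination corners through the mapping and confirm that they land on the source corners. The sketch below is dependency-free: `solve_linear`, `find_coeffs_pure` and `apply_coeffs` are hypothetical helper names, and the hand-written Gaussian elimination stands in for the NumPy-based solver. The corner values assume a 256x256 original sheared with m = -0.5.

```python
def solve_linear(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Augmented matrix, so row operations touch both sides at once.
    aug = [[float(v) for v in row] + [float(rhs)] for row, rhs in zip(a, b)]
    for col in range(n):
        # Use the row with the largest entry in this column as the pivot.
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("singular system: degenerate point configuration")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for k in range(col, n + 1):
                aug[r][k] -= factor * aug[col][k]
    # Back-substitution on the upper-triangular system.
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(aug[r][k] * x[k] for k in range(r + 1, n))
        x[r] = (aug[r][n] - s) / aug[r][r]
    return x


def find_coeffs_pure(pa, pb):
    """Same linear system as find_coeffs: pa are output points, pb source points."""
    rows, rhs = [], []
    for (x, y), (u, v) in zip(pa, pb):
        rows.append([x, y, 1, 0, 0, 0, -u * x, -u * y])
        rows.append([0, 0, 0, x, y, 1, -v * x, -v * y])
        rhs.extend([u, v])
    return solve_linear(rows, rhs)


def apply_coeffs(coeffs, x, y):
    """Map an output pixel (x, y) to the input position that gets sampled."""
    a, b, c, d, e, f, g, h = coeffs
    w = g * x + h * y + 1
    return (a * x + b * y + c) / w, (d * x + e * y + f) / w


dst = [(0, 0), (256, 0), (256, 256), (0, 256)]    # corners of the output image
src = [(0, 0), (256, 0), (384, 256), (128, 256)]  # same corners in the sheared image
coeffs = find_coeffs_pure(dst, src)

# Largest distance (per coordinate) between a mapped corner and its target.
max_error = max(
    abs(got - want)
    for p, q in zip(dst, src)
    for got, want in zip(apply_coeffs(coeffs, *p), q)
)
print("max corner error:", max_error)
```

If the two point lists are accidentally swapped, the same check still passes for the swapped pairing, so it is worth also confirming which list describes the output image before calling Image.transform.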

    You can also have some fun with the destination points:

    [Images: results with other destination points]