image-processing homography scipy.ndimage

reverse or invertible ndimage.map_coordinates mapping based on planar homography

Suppose I have an image that I want to warp as part of a forward model of a system. In the reverse model, I need to be able to undo the warp. Consider the following:

import numpy as np

from scipy.ndimage import map_coordinates

from matplotlib import pyplot as plt

def make_rotation_matrix(abg, radians=False):
    ABG = np.zeros(3)
    ABG[:len(abg)] = abg
    abg = ABG
    if not radians:
        abg = np.radians(abg)

    alpha, beta, gamma = abg
    cos1 = np.cos(alpha)
    cos2 = np.cos(beta)
    cos3 = np.cos(gamma)
    sin1 = np.sin(alpha)
    sin2 = np.sin(beta)
    sin3 = np.sin(gamma)
    Rx = np.asarray([
        [1,    0,  0   ],  # NOQA
        [0, cos1, -sin1],
        [0, sin1,  cos1]
    ])
    Ry = np.asarray([
        [cos2,  0, sin2],
        [    0, 1,    0],  # NOQA
        [-sin2, 0, cos2],
    ])
    Rz = np.asarray([
        [cos3, -sin3, 0],
        [sin3,  cos3, 0],
        [0,        0, 1],
    ])
    m = Rz@Ry@Rx
    return m


# draw a square in an image
sfe = np.zeros((128,128), dtype=float)
c=64
w=32
sfe[c-w:c+w,c-w:c+w] = 1

# compute the coordinates, translate to the origin, rotate, translate back
xin = np.arange(128)
yin = np.arange(128)
xin, yin = np.meshgrid(xin,yin)

rot = make_rotation_matrix((0,45,0))

ox, oy = 127/2, 127/2
tin = np.eye(4)
tin[0,-1] = -ox
tin[1,-1] = -oy

tout = np.eye(4)
tout[0,-1] = ox
tout[1,-1] = oy

rot2 = np.zeros((4,4), dtype=float)
rot2[:3,:3] = rot
rot2[-1,-1] = 1

M = tout@(rot2@tin)
Mi = np.linalg.inv(M)

points = np.zeros((xin.size, 4), dtype=float)
points[:,0] = xin.ravel()
points[:,1] = yin.ravel()
points[:,2] = 0  # z=0
points[:,3] = 1 # lambda coordinate for homography

out = np.dot(Mi, points.T)

xout = out[0].reshape(xin.shape)
yout = out[1].reshape(yin.shape)
zout = out[2].reshape(xin.shape)
hout = out[3].reshape(xin.shape)

# do I need to do something differently here?
points2 = points.copy()
out2 = np.dot(M, points2.T)

xout2 = out2[0].reshape(xin.shape)
yout2 = out2[1].reshape(yin.shape)
zout2 = out2[2].reshape(xin.shape)
hout2 = out2[3].reshape(xin.shape)

mapped = map_coordinates(sfe, (yout,xout))
unmapped = map_coordinates(mapped, (yout2,xout2))
neighbors = np.hstack((sfe, mapped, unmapped))
plt.imshow(neighbors)

If I perform a clocking (in-plane) rotation instead of an out-of-plane rotation, I get the behavior I expect.

I understand that by construction I am assuming the image is a birds-eye view or a planar homography, which is OK. What am I missing? Some googling on image warping turns up cryptic MATLAB answers, but I do not understand what the "spatial referencing" is doing.

Edit: An example homography whose inverse does not actually undo the transformation with map_coordinates:

H = np.array([[ 0.063, -0.011,  0.761],
       [ 0.011,  0.063, -0.639],
       [-0.   , -0.   ,  0.063]])

If I simply plot a square with plt.scatter, the inverse does exactly undo the transformation.

Solution

I've refactored your code to see what's going on.

For one, map_coordinates() does a "pull", i.e. it pulls source pixels into a result grid, using indices you supply. Those indices need to be generated using a regular grid (for the result) and the inverse of the transformation (to the source frame). That is why your square appears to expand rather than contract.
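The "pull" behaviour can be seen on a tiny example. This is a minimal sketch, not part of the original code: the 7x7 image, the two-column shift, and the names src, pulled and wrong are made up for illustration.

```python
import numpy as np
from scipy.ndimage import map_coordinates

# A small image with a single bright pixel at row 3, column 3.
src = np.zeros((7, 7))
src[3, 3] = 1.0

# Regular grid of *output* indices, in (row, column) order.
rows, cols = np.meshgrid(np.arange(7, dtype=float),
                         np.arange(7, dtype=float), indexing="ij")

# Goal: move the content 2 columns to the right.
# Pull semantics: output[r, c] = src[r, c - 2], so we sample with the inverse shift.
pulled = map_coordinates(src, (rows, cols - 2), order=0, mode="constant", cval=0.0)

# Sampling with the forward shift moves the content the opposite way.
wrong = map_coordinates(src, (rows, cols + 2), order=0, mode="constant", cval=0.0)

print(np.argwhere(pulled == 1.0))  # bright pixel now at row 3, column 5
print(np.argwhere(wrong == 1.0))   # bright pixel now at row 3, column 1
```

The coordinates passed to map_coordinates() always describe where each output pixel reads from in the source, which is why the inverse of the intended transform goes in.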

Second, dropping Z does matter, and so does where you drop it (on the inputs or the outputs of the 4x4 transformation), especially when you invert the transformation.

Given an out-of-plane rotation, say around Y, you get something like this:

[[ 0.70711  0.       0.70711  0.     ]
 [ 0.       1.       0.       0.     ]
 [-0.70711  0.       0.70711  0.     ]
 [ 0.       0.       0.       1.     ]]

The inverse of that is:

[[ 0.70711  0.      -0.70711  0.     ]
 [ 0.       1.       0.       0.     ]
 [ 0.70711  0.       0.70711  0.     ]
 [ 0.       0.       0.       1.     ]]

If you drop Z in both (and apply that to 2D data, which you have), you now get a pair of transforms that both contract the image:

[[0.70711 0.      0.     ]
 [0.      1.      0.     ]
 [0.      0.      1.     ]]

[[0.70711 0.      0.     ]
 [0.      1.      0.     ]
 [0.      0.      1.     ]]

(In your case, that causes expansion each time because of map_coordinates() and its "pull" operation.)

The contraction is what an out-of-plane rotation of in-plane points looks like in 2D, but it is not itself a rotation. Dropping Z from both matrices does not preserve inv(M) @ M == I.

The rotated points, having been rotated out of their plane, have non-zero Z, which is important when you want to rotate them further (e.g. rotate them back). Dropping Z means you no longer have that information. You have to assume their positions in space, and the transformation in 2D has to contract or stretch instead, depending on what plane you assume they come from and where they need to go.

You have to drop Z in M (4x4) first, then invert the resulting 3x3 matrix. Now you have the correct inverse, which expands the image, so that applying both transforms in sequence gives the identity:

[[0.70711 0.      0.     ]
 [0.      1.      0.     ]
 [0.      0.      1.     ]]

[[1.41421 0.      0.     ]
 [0.      1.      0.     ]
 [0.      0.      1.     ]]
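The difference between the two orders of operation can be checked numerically. The following is a small sketch under the same assumptions as the matrices above (a 45 degree rotation about Y, no translation); drop_z, wrong_inverse and right_inverse are illustrative names, not part of the original code.

```python
import numpy as np

def drop_z(T4):
    # Keep only the X, Y and W rows and columns of a 4x4 homogeneous transform.
    keep = [0, 1, 3]
    return T4[np.ix_(keep, keep)]

theta = np.radians(45)
c, s = np.cos(theta), np.sin(theta)
M = np.array([
    [ c, 0, s, 0],
    [ 0, 1, 0, 0],
    [-s, 0, c, 0],
    [ 0, 0, 0, 1],
])

M3 = drop_z(M)
wrong_inverse = drop_z(np.linalg.inv(M))  # invert in 3D, then drop Z
right_inverse = np.linalg.inv(M3)         # drop Z, then invert in 2D

# Inverting first leaves a net x scale of 0.5 instead of the identity.
print(np.round(wrong_inverse @ M3, 5))
# Dropping Z first gives a true inverse of the 2D transform.
print(np.round(right_inverse @ M3, 5))
```

Only the second product is the identity, which is the property the round trip through map_coordinates() relies on.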

Now here's some code:

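import numpy as np
import cv2 as cv
from numpy.linalg import inv
from scipy.ndimage import map_coordinates
from matplotlib import pyplot as plt

# make_rotation_matrix() is the function from the question

def translate4(tx=0, ty=0, tz=0):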
    T = np.eye(4)
    T[0:3, 3] = (tx, ty, tz)
    return T

def rotate4(rx=0, ry=0, rz=0):
    R = np.eye(4)
    R[0:3, 0:3] = make_rotation_matrix((rx, ry, rz))
    return R

def dropZ(T4):
    "assumes that XYZW inputs have Z=0 and that the result's Z will be ignored"
    tmp = T4[[0,1,3], :]
    tmp = tmp[:, [0,1,3]]
    return tmp

# input data. don't mind the use of OpenCV. I wasn't in the mood to come up with random() calls to give the square some texture.
im_source = cv.imread(cv.samples.findFile("lena.jpg"), cv.IMREAD_GRAYSCALE)
height, width = im_source.shape[:2]

# transformation: rotate around center
cx, cy = (width-1)/2, (height-1)/2
Tin = translate4(-cx, -cy)
Tout = translate4(+cx, +cy)
R = rotate4(ry=45, rz=30) # with a little in-plane rotation
M = Tout @ R @ Tin

M3 = dropZ(M)
Mi3 = inv(M3)
#print(M3)
#print(Mi3)

# coordinates grid
xin = np.arange(width)
yin = np.arange(height)
xin, yin = np.meshgrid(xin, yin)
zin = np.zeros_like(xin)
win = np.ones_like(xin)

points4 = np.vstack((xin.flatten(), yin.flatten(), zin.flatten(), win.flatten()))
print(points4)

points3 = np.vstack((xin.flatten(), yin.flatten(), win.flatten()))
print(points3)

# map_coordinates() pulls, so the sampling grid for a forward warp by M3
# is the regular output grid transformed by inv(M3).
# can't invert right at the map_coords() call because we've already warped the grid by then

points_warped = Mi3 @ points3 # apply inv(M3) to identity grid: sampling coords that warp the input image by M3
print("warped:")
print(points_warped)

points_identity = M3 @ points_warped # apply M3 to warped grid, giving identity grid
# it's equal to M3 @ inv(M3) @ points3
# which is I (identity) @ points3
print("unwarped: (identity grid)")
print(points_identity)

points_unwarping = M3 @ points3 # apply M3 to identity grid: sampling coords that unwarp the *warped* image

# map_coordinates() wants indices, so Y,X or I,J
coords_warped = points_warped.reshape((3, height, width))[[1,0]]
coords_identity = points_identity.reshape((3, height, width))[[1,0]]
coords_unwarping = points_unwarping.reshape((3, height, width))[[1,0]]

im_warped = map_coordinates(im_source, coords_warped)
im_identity = map_coordinates(im_source, coords_identity)
im_unwarped = map_coordinates(im_warped, coords_unwarping)

neighbors = np.hstack((im_source, im_warped, im_identity, im_unwarped))
#neighbors = np.hstack((im1, im2, im3))
plt.figure(figsize=(20,20))
plt.imshow(neighbors, cmap='gray')
plt.show()

Fortunately, the transformation is linear in homogeneous coordinates, so inv(M3) @ M3 == I == M3 @ inv(M3).
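For readers without OpenCV or the sample photo, here is a self-contained round-trip check. It is a sketch under stated assumptions rather than the code above: it uses a synthetic Gaussian blob as the image, only the 45 degree rotation about Y (so the 2D transform is a pure x scale about the image centre), and a hypothetical helper sample_coords.

```python
import numpy as np
from scipy.ndimage import map_coordinates

height, width = 64, 64
yy, xx = np.meshgrid(np.arange(height, dtype=float),
                     np.arange(width, dtype=float), indexing="ij")
cx, cy = (width - 1) / 2, (height - 1) / 2

# Smooth synthetic test image (a Gaussian blob) instead of a photo.
im_source = np.exp(-((xx - cx) ** 2 + (yy - cy) ** 2) / (2 * 8.0 ** 2))

# 2D transform left after dropping Z from a 45 degree rotation about Y,
# applied about the image centre: x is scaled by cos(45 deg).
s = np.cos(np.radians(45))
M3 = np.array([
    [s, 0, cx - s * cx],
    [0, 1, 0],
    [0, 0, 1],
])
Mi3 = np.linalg.inv(M3)

def sample_coords(T3):
    # Apply a 3x3 transform to the regular (x, y, 1) grid; return (row, col) arrays.
    pts = np.vstack((xx.ravel(), yy.ravel(), np.ones(xx.size)))
    out = T3 @ pts
    out = out / out[2]  # homogeneous normalisation (a no-op for this affine T3)
    return out[1].reshape(height, width), out[0].reshape(height, width)

im_warped = map_coordinates(im_source, sample_coords(Mi3))   # forward warp by M3
im_unwarped = map_coordinates(im_warped, sample_coords(M3))  # undo the warp

max_error = float(np.abs(im_unwarped - im_source).max())
print(f"max round-trip error: {max_error:.2e}")
```

The recovered image should match the source up to interpolation error. A warp that pushes content outside the frame would lose that content for good, so an exact round trip is only possible where the warped image still contains the data.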