python, numpy, image-processing, linear-algebra, transformation-matrix

2D Rotation of Image


[Image: output when attempting a 90-degree rotation; the left one is the original image.]

I am trying to rotate the image by any given angle, using the center of the image as the origin of rotation.

But the code is not doing the rotation as expected. I am attaching the code below.

import math
import numpy as np
import cv2

im = cv2.imread("Samples\\baboon.jpg", cv2.IMREAD_GRAYSCALE)
new = np.zeros(im.shape,np.uint8)

new_x = im.shape[0] // 2
new_y = im.shape[1] // 2

x = int(input("Enter the angle : "))

trans_mat = np.array([[math.cos(x), math.sin(x), 0],[-math.sin(x), math.cos(x), 0],[0, 0, 1]])

for i in range(-new_x, im.shape[0] - new_x):
    for j in range(-new_y, im.shape[1] - new_y):
        vec = np.matmul([i, j, 1], trans_mat)
        if round(vec[0] + new_x) < 512 and round(vec[1] + new_y) < 512:
            new[round(vec[0]+new_x), round(vec[1]+new_y)] = im[i+new_x,j+new_y]

cv2.imshow("rot",new)
cv2.imshow("1",im)
cv2.waitKey(0)
cv2.destroyAllWindows()

Solution

  • It looks like you are trying to implement a nearest-neighbor resampler. What you are doing is going through the input image and mapping each input pixel to a new location in the output image. This can lead to problems such as pixels overwriting each other, output pixels being left empty, and similar artifacts.

    I would suggest (based on experience) that you are looking at the problem backwards. Rather than looking at where an input pixel ends up in the output, you should consider where each output pixel originates in the input. That way, you have no ambiguity about nearest neighbors, and the entire image array will be filled.

    You want to rotate about the center. The current rotation matrix you are using rotates about (0, 0). To compensate for that, you need to translate the center of the image to (0, 0), rotate, and then translate back. Rather than developing the full affine matrix, I will show you how to do the individual operations manually, and then how to combine them into the transform matrix.
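
    As a side note, if you want a reference result to compare against while debugging, OpenCV already provides this translate-rotate-translate composition: cv2.getRotationMatrix2D builds the 2x3 affine matrix and cv2.warpAffine applies it. A minimal sketch (note that the center is given as (x, y), i.e. (column, row), the angle is in degrees, and OpenCV's sign convention for the angle may be opposite to the one in your code):

    import cv2

    im = cv2.imread("Samples\\baboon.jpg", cv2.IMREAD_GRAYSCALE)
    angle_deg = 90.0  # test angle
    center = (im.shape[1] / 2, im.shape[0] / 2)          # (x, y) = (col, row)
    M = cv2.getRotationMatrix2D(center, angle_deg, 1.0)  # rotate about center, no scaling
    reference = cv2.warpAffine(im, M, (im.shape[1], im.shape[0]))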

    Manual Computation

    First get an input and output image:

    im = cv2.imread("Samples\\baboon.jpg", cv2.IMREAD_GRAYSCALE)
    new = np.zeros_like(im)
    

    Then determine the center of rotation. Be clear about your dimensions: x is usually the column (dim 1), not the row (dim 0):

    center_row = im.shape[0] // 2
    center_col = im.shape[1] // 2
    

    Compute the radial coordinates of each pixel in the image, shaped to the corresponding dimension:

    row_coord = np.arange(im.shape[0])[:, None] - center_row
    col_coord = np.arange(im.shape[1]) - center_col
    

    row_coord and col_coord are the distances from center in the output image. Now compute the locations where they came from in the input. Notice that we can use broadcasting to avoid the need for a loop. I'm following your original convention for angle definitions here, and finding the inverse rotation to determine the source location. The big difference here is that the input in degrees is converted to radians, since that's what the trigonometric functions expect:

    angle = float(input('Enter Angle in Degrees: ')) * np.pi / 180.0 
    source_row = row_coord * np.cos(angle) - col_coord * np.sin(angle) + center_row
    source_col = row_coord * np.sin(angle) + col_coord * np.cos(angle) + center_col
    
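    To make the broadcasting explicit: row_coord has shape (rows, 1) and col_coord has shape (cols,), so the two expressions above broadcast to full (rows, cols) arrays of source coordinates without any Python loop. A quick shape check, assuming a 512 x 512 input like the baboon sample:

    print(row_coord.shape)    # (512, 1)
    print(col_coord.shape)    # (512,)
    print(source_row.shape)   # (512, 512)
    print(source_col.shape)   # (512, 512)
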

    If all the indices were guaranteed to fall within the input image, you wouldn't even need to pre-allocate the output. You could literally just do new = im[source_row, source_col]. However, you need to mask the indices:

    mask = (source_row >= 0) & (source_row < im.shape[0]) & (source_col >= 0) & (source_col < im.shape[1])
    new[mask] = im[source_row[mask].round().astype(int), source_col[mask].round().astype(int)]
    
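    Putting the manual steps together, here is a minimal end-to-end sketch. The function name rotate_nearest is just an illustrative choice; it assumes a single-channel image like your grayscale baboon, and it rounds the source coordinates before masking so that values just below the upper bound cannot round to an out-of-range index:

    import numpy as np

    def rotate_nearest(im, angle_deg):
        """Rotate im about its center by angle_deg degrees, nearest-neighbor."""
        new = np.zeros_like(im)
        center_row = im.shape[0] // 2
        center_col = im.shape[1] // 2

        # Output-pixel offsets from the center, shaped for broadcasting.
        row_coord = np.arange(im.shape[0])[:, None] - center_row
        col_coord = np.arange(im.shape[1]) - center_col

        # Inverse rotation: where does each output pixel come from in the input?
        angle = angle_deg * np.pi / 180.0
        source_row = row_coord * np.cos(angle) - col_coord * np.sin(angle) + center_row
        source_col = row_coord * np.sin(angle) + col_coord * np.cos(angle) + center_col

        rows = source_row.round().astype(int)
        cols = source_col.round().astype(int)
        mask = (rows >= 0) & (rows < im.shape[0]) & (cols >= 0) & (cols < im.shape[1])
        new[mask] = im[rows[mask], cols[mask]]
        return new
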

    Affine Transforms

    Now let's take a look at using affine transforms. First you want to subtract the center from your coordinates. Let's say you have a column vector [[r], [c], [1]], where rc and cc denote the center row and center column. The translation that moves the center to zero would be the matrix

    [[r']    [[1  0 -rc]  [[r]
     [c']  =  [0  1 -cc] . [c]
     [1 ]]    [0  0  1 ]]  [1]]
    

    Then the (backwards) rotation is applied:

    [[r'']    [[cos(a) -sin(a) 0]  [[r']
     [c'']  =  [sin(a)  cos(a) 0] . [c']
     [ 1 ]]    [  0       0    1]]  [1 ]]
    

    And finally, you need to translate back to center:

    [[r''']    [[1  0 rc]  [[r'']
     [c''']  =  [0  1 cc] . [c'']
     [ 1  ]]    [0  0  1]]  [ 1 ]]
    

    If you multiply these three matrices out in order from right to left, you get

       [[cos(a)   -sin(a)    cc * sin(a) - rc * cos(a) + rc]
    M = [sin(a)    cos(a)   -cc * cos(a) - rc * sin(a) + cc]
        [  0         0                      1              ]]
    
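    If you want to sanity-check that algebra, you can build the three matrices numerically and compare their product against the closed form (a quick check with an arbitrary angle and a hypothetical 512 x 512 image, so rc = cc = 256):

    import numpy as np

    a = np.deg2rad(37.0)   # arbitrary test angle
    rc, cc = 256, 256      # row/column center

    to_origin = np.array([[1, 0, -rc], [0, 1, -cc], [0, 0, 1]])
    rotation  = np.array([[np.cos(a), -np.sin(a), 0],
                          [np.sin(a),  np.cos(a), 0],
                          [0,          0,         1]])
    back      = np.array([[1, 0, rc], [0, 1, cc], [0, 0, 1]])

    M = np.array([[np.cos(a), -np.sin(a),  cc * np.sin(a) - rc * np.cos(a) + rc],
                  [np.sin(a),  np.cos(a), -cc * np.cos(a) - rc * np.sin(a) + cc],
                  [0,          0,          1]])

    assert np.allclose(back @ rotation @ to_origin, M)
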

    If you build a full matrix of output coordinates rather than the subset arrays we started with, you can use np.matmul, a.k.a. the @ operator, to do the multiplication for you. There is no need for this level of complexity for such a simple case, though:

    matrix = np.array([[np.cos(angle), -np.sin(angle),  center_col * np.sin(angle) - center_row * np.cos(angle) + center_row],
                       [np.sin(angle),  np.cos(angle), -center_col * np.cos(angle) - center_row * np.sin(angle) + center_col],
                       [0, 0, 1]])
    
    # homogeneous output coordinates: coord[r, c] = [[r], [c], [1]]
    coord = np.ones((*im.shape, 3, 1))
    coord[..., 0, :] = np.arange(im.shape[0]).reshape(-1, 1, 1)
    coord[..., 1, :] = np.arange(im.shape[1]).reshape(-1, 1)
    
    source = (matrix @ coord)[..., :2, 0]
    

    The remainder of the processing is fairly similar to the manual computations:

    mask = ((source >= 0) & (source < im.shape)).all(axis=-1)
    new[mask] = im[source[mask, 0].round().astype(int), source[mask, 1].round().astype(int)]
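
    Whichever version you use, you can display the result next to the input exactly as in your original script (assuming new was filled by one of the snippets above):

    cv2.imshow("original", im)
    cv2.imshow("rotated", new)
    cv2.waitKey(0)
    cv2.destroyAllWindows()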