python, image, matrix, dimensions

Matrix multiplication with different shapes


My goal is to use Python to apply several transformations to a matrix representing an image. In particular, I would like the transformations to be achieved via matrix multiplication. I insist on the latter because I'm currently teaching matrices to my students, and this question is related to a mini-project I want to do with them about the possible uses of matrices. I would thus like them to use matrix multiplication.

Here's an example of what I'd like to do. Let's take the following matrix, each element of which represents a pixel (at this point, the elements are integers between 0 (black) and 255 (white), so it's a grayscale image):

image_gray = [
    [0,200,3,48],
    [30,155,255,7],
    [1,218,92,111],
    [175,123,67,6]
]

Let's also consider the following "swap matrix":

swapmat = [
    [0,0,0,1],
    [0,1,0,0],
    [0,0,1,0],
    [1,0,0,0]
]

Mathematically, the matrix product image_gray x swapmat is the same matrix as image_gray, but with columns 1 and 4 exchanged. To do this multiplication in Python, I thought of using numpy.matmul(image_gray, swapmat), and this works fine. However, eventually I would like to work with any image whose pixels are described not by a single integer but by a triplet of RGB values. Such an image would be, for example:

image_rgb = [
    [[12, 50, 201], [34, 120, 55], [255, 0, 0], [3, 4, 5]],
    [[0, 255, 0], [100, 100, 100], [50, 50, 50], [1, 2, 3]],
    [[0, 0, 255], [20, 30, 40], [100, 200, 250], [60, 75, 98]],
    [[0, 4, 25], [23, 37, 46], [30, 240, 232], [6, 7, 8]]
]

Now, I can't use numpy.matmul(image_rgb, swapmat), because the shape of image_rgb is now (4,4,3) instead of (4,4). What is the proper way to do this in Python?

I must also mention that I would like the code to run reasonably fast. My final goal is to take an arbitrary image of reasonable size (say, (1024,1024,3)) and apply about 100 row/column exchanges to it in order to scramble the image; the students' task would then be to apply all the transformations in reverse order to recover the original image. So this should run in a few seconds on an average laptop.
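
A minimal sketch of the scramble-and-recover exercise on a small grayscale matrix could look as follows. The helper name `swap_matrix`, the random seed, and the 8x8 size are illustrative assumptions chosen to keep the example short; they are not taken from an actual project.

```python
import numpy as np

def swap_matrix(n, i, j):
    """Return the n x n identity matrix with rows i and j exchanged (hypothetical helper)."""
    m = np.eye(n, dtype=np.int64)
    m[[i, j]] = m[[j, i]]
    return m

rng = np.random.default_rng(0)   # fixed seed so the run is repeatable
n = 8                            # small size for illustration only
image = rng.integers(0, 256, size=(n, n))

# Record each exchange as (side, matrix):
# "rows" multiplies on the left, "cols" multiplies on the right.
steps = []
scrambled = image.copy()
for _ in range(100):
    i, j = rng.integers(0, n, size=2)
    side = "rows" if rng.random() < 0.5 else "cols"
    p = swap_matrix(n, i, j)
    scrambled = p @ scrambled if side == "rows" else scrambled @ p
    steps.append((side, p))

# A swap matrix is its own inverse, so undoing the scramble means
# replaying the same multiplications in reverse order.
recovered = scrambled.copy()
for side, p in reversed(steps):
    recovered = p @ recovered if side == "rows" else recovered @ p

print(np.array_equal(recovered, image))  # True
```

The order matters because matrix multiplication is not commutative in general, which is exactly why the students have to undo the steps last-to-first.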


Solution

  • Per the documentation, np.matmul treats an argument with more than 2 dimensions as a stack of matrices residing in the last two axes. So you just have to move the color-channel axis of your image array to the front, do the matrix multiplication, and then move the axis back:

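    import numpy as np

    # (rows, cols, 3) -> (3, rows, cols): matmul sees a stack of 3 matrices
    image_rgb_moved = np.moveaxis(image_rgb, -1, 0)
    swapped_moved = np.matmul(image_rgb_moved, swapmat)  # swap columns in each channel
    swapped = np.moveaxis(swapped_moved, 0, -1)  # back to (rows, cols, 3)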
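
As a self-contained check of the approach above, the following sketch applies it to the 4x4 RGB example from the question and compares the result with plain index reordering. The variable names `channels`, `cols_swapped`, and `rows_swapped` are illustrative assumptions; the row exchange via left multiplication is an extension not shown in the original answer.

```python
import numpy as np

image_rgb = np.array([
    [[12, 50, 201], [34, 120, 55], [255, 0, 0], [3, 4, 5]],
    [[0, 255, 0], [100, 100, 100], [50, 50, 50], [1, 2, 3]],
    [[0, 0, 255], [20, 30, 40], [100, 200, 250], [60, 75, 98]],
    [[0, 4, 25], [23, 37, 46], [30, 240, 232], [6, 7, 8]],
])
swapmat = np.array([
    [0, 0, 0, 1],
    [0, 1, 0, 0],
    [0, 0, 1, 0],
    [1, 0, 0, 0],
])

# Move the color axis to the front: shape (4, 4, 3) -> (3, 4, 4).
channels = np.moveaxis(image_rgb, -1, 0)

# Right multiplication exchanges columns 1 and 4 of every channel.
cols_swapped = np.moveaxis(np.matmul(channels, swapmat), 0, -1)

# Left multiplication exchanges rows 1 and 4; swapmat is broadcast
# across the three channel matrices.
rows_swapped = np.moveaxis(np.matmul(swapmat, channels), 0, -1)

# Cross-check against direct index reordering.
assert np.array_equal(cols_swapped, image_rgb[:, [3, 1, 2, 0], :])
assert np.array_equal(rows_swapped, image_rgb[[3, 1, 2, 0], :, :])

print(cols_swapped[0, 0])  # [3 4 5]
```

For a (1024,1024,3) image the same code applies unchanged. Run time will depend on the machine and on the array dtype, so it is worth timing on the target laptop before using it in class.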