Tags: python, image-processing, computer-vision, scikit-image, rescale

skimage.transform.rescale adds an extra dimension to input image


When using skimage.transform.rescale() I get an output array with 3 dimensions, even though my input image has only 2 dimensions.

from skimage import io, color, transform

image = io.imread(r'C:\Users\ParthD\PycharmProjects\pythonProject\test_images\6.png')
image_bw = color.rgb2gray(color.rgba2rgb(image))
image_rescaled = transform.rescale(image, scale=0.5, anti_aliasing=True)

print(image_bw.shape)
print(image_rescaled.shape)

This gives the output:

(397, 602)
(198, 301, 2)

I'm not sure where that additional dimension of size 2 is coming from. I checked the rescale function's documentation, but there is no parameter that seems to account for this extra dimension.


Solution

  • The problem is that the channel dimension is being interpreted as a spatial dimension.

    You should pass transform.rescale the multichannel=True flag so it leaves the channels alone (in scikit-image >= 0.19 the same thing is spelled channel_axis=-1, and multichannel is deprecated):

    image_rescaled = transform.rescale(image, scale=0.5, anti_aliasing=True, multichannel=True)
    

    Without that flag, transform.rescale treats your array as a plain 3-D image. Note that you rescaled image, not image_bw: since your PNG has an alpha channel (you call rgba2rgb on it), image has shape (397, 602, 4), which goes down to (198, 301, 2). The channel axis is interpolated as if it were another spatial dimension (4 × 0.5 = 2), which is where the size-2 dimension comes from.

    If your image is a grayscale image with no channel dimension (like image_bw in your snippet), do not pass the multichannel=True flag: that would make the last spatial axis be treated as channels, and you would get undesired output.
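The grayscale path, for comparison (a minimal sketch): a 2-D array has no channel axis, so rescale needs no channel flag at all.

```python
import numpy as np
from skimage import transform

gray = np.random.rand(40, 60)  # 2-D grayscale array, no channel axis
out = transform.rescale(gray, scale=0.5, anti_aliasing=True)
print(out.shape)  # (20, 30)
```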

    You can refer to the rescale documentation for details.