I have difficulty understanding the behavior of `scipy.ndimage.zoom()` when `order=0`.
Consider the following code:

```python
import numpy as np
import scipy as sp
import scipy.ndimage

arr = np.arange(3) + 1
print(arr)
for order in range(5):
    zoomed = sp.ndimage.zoom(arr.astype(float), 4, order=order)
    print(order, np.round(zoomed, 3))
```
whose output is:

```
0 [1. 1. 1. 2. 2. 2. 2. 2. 2. 3. 3. 3.]
1 [1. 1.182 1.364 1.545 1.727 1.909 2.091 2.273 2.455 2.636 2.818 3. ]
2 [1. 1.044 1.176 1.394 1.636 1.879 2.121 2.364 2.606 2.824 2.956 3. ]
3 [1. 1.047 1.174 1.365 1.601 1.864 2.136 2.399 2.635 2.826 2.953 3. ]
4 [1. 1.041 1.162 1.351 1.59 1.86 2.14 2.41 2.649 2.838 2.959 3. ]
```
So, when `order=0`, the values are (expectedly) not interpolated. However, I was expecting to get:

```
[1. 1. 1. 1. 2. 2. 2. 2. 3. 3. 3. 3.]
```

i.e. exactly the same number of elements for each value, since the zoom factor is a whole number. Hence, I was expecting the same result as `np.repeat()`:

```python
print(np.repeat(arr.astype(float), 4))
# [1. 1. 1. 1. 2. 2. 2. 2. 3. 3. 3. 3.]
```
Why is there a variation in the number of times each element gets repeated?
Note that `np.repeat()` does not directly work on multi-dimensional arrays, which is why I would like to get the "correct" behavior from `scipy.ndimage.zoom()`.
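(For what it is worth, `np.repeat()` can still be made to do block upsampling of a multi-dimensional array by applying it once per axis; a minimal sketch of that workaround, not of `zoom()` itself:)

```python
import numpy as np

arr = np.arange(1, 7).reshape(2, 3).astype(float)
# Repeat along each axis in turn: the multi-dimensional analogue of
# np.repeat(arr, 4) -- every value fills an equally-sized 4x4 block.
zoomed = arr.repeat(4, axis=0).repeat(4, axis=1)
print(zoomed.shape)  # (8, 12)
```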
My NumPy and SciPy versions are:

```python
print(np.__version__)
# 1.17.4
print(sp.__version__)
# 1.3.3
```
I found this question: `scipy.ndimage.zoom` vs `skimage.transform.rescale` with `order=0`, which points toward some unexpected behavior of `scipy.ndimage.zoom()`, but I am not quite sure it is the same effect being observed.
This is a bin/edge array-interpretation issue. The behavior of `scipy.ndimage.zoom()` is based on the *edge* interpretation of the array values, while the behavior that would produce equally-sized blocks for integer zoom factors (mimicking `np.repeat()`) is based on the *bin* interpretation.
Let's illustrate with some "pictures". Consider the array `[1 2 3]`, and let's assign each value to a bin. The edges of each bin would be `0` and `1` for value `1`, `1` and `2` for value `2`, etc.:

```
0 1 2 3
|1|2|3|
```
Now, let's zoom this array by a factor of 4:

```
                    1 1 1
0 1 2 3 4 5 6 7 8 9 0 1 2
|   1   |   2   |   3   |
```

Hence, the values to assign to the bins using the nearest-neighbor method are:

```
                    1 1 1
0 1 2 3 4 5 6 7 8 9 0 1 2
|1 1 1 1|2 2 2 2|3 3 3 3|
```
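The bin interpretation amounts to sampling at the bin centers, which for an integer zoom factor reduces to integer division of the new index (a hypothetical sketch of the idea, not SciPy's actual code):

```python
import numpy as np

arr = np.array([1.0, 2.0, 3.0])
zoom = 4
# Each new bin center (i + 0.5) / zoom - 0.5 falls inside old bin i // zoom,
# so every old value fills an equally-sized block of `zoom` elements.
new_idx = np.arange(arr.size * zoom) // zoom
print(arr[new_idx])  # [1. 1. 1. 1. 2. 2. 2. 2. 3. 3. 3. 3.]
```

which is exactly what `np.repeat(arr, 4)` produces.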
Consider the same array as before, `[1 2 3]`, but now let's assign each value to an edge:

```
0 1 2
| | |
1 2 3
```

Now, let's zoom this array by a factor of 4:

```
                    1 1
0 1 2 3 4 5 6 7 8 9 0 1
| | | | | | | | | | | |
1          2          3
```

Hence, the values to assign to the edges using the nearest-neighbor method are:

```
                    1 1
0 1 2 3 4 5 6 7 8 9 0 1
| | | | | | | | | | | |
1 1 1 2 2 2 2 2 2 3 3 3
```
Edge `3` is assigned to `2` because `2` sits at position `5.5` while `1` sits at position `0`, and `(5.5 - 3 = 2.5) < (3 - 0 = 3)`. Similarly, edge `8` is assigned to `2` because `(8 - 5.5 = 2.5) < (11 - 8 = 3)`.
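This nearest-edge assignment can be reproduced numerically; a sketch of the edge interpretation (assuming plain rounding of the source position, which reproduces the `order=0` output above):

```python
import numpy as np

arr = np.array([1.0, 2.0, 3.0])
m = 12  # new size for a zoom factor of 4
# Edge interpretation: old value j sits at new position j * (m - 1) / (n - 1),
# i.e. 0, 5.5 and 11 here; each new edge picks the closest such position.
src = np.arange(m) * (arr.size - 1) / (m - 1)
print(arr[np.round(src).astype(int)])
# [1. 1. 1. 2. 2. 2. 2. 2. 2. 3. 3. 3.]
```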
In physics, the "bin array interpretation" is generally more useful, because measurements are typically the result of some integration over a certain bin in an appropriate domain (notably signals of any form, including images, collected over a given time interval). Hence, I was expecting a bin interpretation from `scipy.ndimage.zoom()`, but I acknowledge that the edge interpretation is equally valid (although I am not sure which applications benefit the most from it).
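For what it is worth, newer SciPy releases (1.6.0+, so not the 1.3.3 used above) expose this choice directly: `scipy.ndimage.zoom()` accepts a `grid_mode` keyword, and `grid_mode=True` selects the bin (pixel-area) interpretation. A sketch, assuming such a release is installed:

```python
import numpy as np
import scipy.ndimage

arr = np.array([1.0, 2.0, 3.0])
# grid_mode=True (SciPy >= 1.6) treats the values as bins rather than edges;
# mode='grid-constant' is the boundary mode recommended alongside it.
zoomed = scipy.ndimage.zoom(arr, 4, order=0, grid_mode=True, mode='grid-constant')
print(zoomed)  # [1. 1. 1. 1. 2. 2. 2. 2. 3. 3. 3. 3.]
```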
(Thanks to @Patol75 for pointing me in the right direction.)