Let's say I have a 1-d array of size `n`, where each element is associated with a bin index, in non-decreasing order from `0` to `m-1` with `m < n`, like this for instance:
```
[a1, a2, a3, a4, ..., aN]  # my array
[ 0,  1,  1,  1, ..., m-1] # associated bin indices
```
You can think of it as the array being divided into `m` bins. Importantly, they are not necessarily all the same size. The question is how to get the array of size `m` containing the minimum of each bin. Everything I thought of involves either a non-numpy loop or list, or adding dummy data, which presumably results in useless traversals (in practice, with my actual data, it means lots of extra dummy data):
- A plain Python loop over precomputed `(begin, end)` bin boundaries:

  ```
  mins = np.zeros(m)
  for i, (begin, end) in enumerate(bins):
      mins[i] = np.min(myarray[begin:end])
  ```
- A boolean `mask` of size `m * bin_size` (with `bin_size` the size of the largest bin) that is `True` for non-dummy positions, plus a `fake_array` of the same shape initialized to a big constant:

  ```
  fake_array[mask] = myarray
  fake_array_2d = fake_array.reshape((m, bin_size))
  mins = np.min(fake_array_2d, axis=1)
  ```
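For what it's worth, a self-contained sketch of that padding idea on hypothetical toy data (three bins of sizes 2, 3 and 2, with `np.inf` standing in for the big constant):

```python
import numpy as np

myarray = np.array([5.0, 2.0, 7.0, 1.0, 9.0, 3.0, 8.0])
m, bin_size = 3, 3  # three bins; the largest holds 3 elements

# True where a real element goes, False for the padding slots
mask = np.array([True, True, False,   # bin 0 (size 2)
                 True, True, True,    # bin 1 (size 3)
                 True, True, False])  # bin 2 (size 2)

fake_array = np.full(m * bin_size, np.inf)
fake_array[mask] = myarray  # fills the True slots in order
fake_array_2d = fake_array.reshape((m, bin_size))
mins = np.min(fake_array_2d, axis=1)
# mins -> [2., 1., 3.]
```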
- An `(m, n)` `fake_matrix` (again initialized to a big number) that each element is scattered into, one row per bin:

  ```
  np.put_along_axis(fake_matrix, indices[None, :], myarray, axis=0)  # took me a while to get that one right
  mins = np.min(fake_matrix, axis=1)
  ```
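For completeness, a runnable sketch of that scatter approach on hypothetical toy data with three bins, using `np.inf` as the big number:

```python
import numpy as np

myarray = np.array([5.0, 2.0, 7.0, 1.0, 9.0, 3.0, 8.0])
indices = np.array([0, 0, 1, 1, 1, 2, 2])  # bin index of each element
m, n = 3, myarray.size

fake_matrix = np.full((m, n), np.inf)  # one row per bin
# place element j in row indices[j], column j
np.put_along_axis(fake_matrix, indices[None, :], myarray, axis=0)
mins = np.min(fake_matrix, axis=1)
# mins -> [2., 1., 3.]
```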
Is there a better option?
Taking inspiration from the answer posted by jin-pendragon, I found a solution using `np.minimum.reduceat`. You need an array of length `m` containing the slice starting points (so the first element is always `0`), let's call it `start_indices`, and then you can do:

```
mins = np.minimum.reduceat(myarray, start_indices)
```
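To illustrate on toy data with three bins (the derivation of `start_indices` from the per-element bin indices is one possible way, assuming those indices are non-decreasing):

```python
import numpy as np

myarray = np.array([5.0, 2.0, 7.0, 1.0, 9.0, 3.0, 8.0])
indices = np.array([0, 0, 1, 1, 1, 2, 2])  # non-decreasing bin index per element

# one way to build start_indices: 0, plus every position where the bin changes
start_indices = np.r_[0, np.flatnonzero(np.diff(indices)) + 1]
# start_indices -> [0, 2, 5]

# reduceat applies minimum over each slice [start, next_start)
mins = np.minimum.reduceat(myarray, start_indices)
# mins -> [2., 1., 3.]
```

Note that `reduceat` handles unequal bin sizes directly, with no padding and no dummy data.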