I have a 2D array of shape (4,5) and another 2D array of shape (4,2). Each row of the second array holds the start and end column indices (both inclusive) of the values I want to keep from the corresponding row of the first array, i.e., I want to slice the first array using the second array.
import numpy as np
np.random.seed(0)
a = np.random.randint(0,999,(4,5))
a
array([[684, 559, 629, 192, 835],
       [763, 707, 359,   9, 723],
       [277, 754, 804, 599,  70],
       [472, 600, 396, 314, 705]])
idx = np.array([[2,4],
                [0,3],
                [2,3],
                [1,3]
               ])
Expected output: either of the following two formats is fine. The only reason for padding with zeros is that variable-length 2D arrays are not supported.
[[629, 192, 835, 0, 0],
[763, 707, 359, 9, 0],
[804, 599, 0, 0, 0],
[600, 396, 314, 0, 0]
]
[[0, 0, 629, 192, 835],
[763, 707, 359, 9, 0],
[0, 0, 804, 599, 0],
[0, 600, 396, 314, 0]
]
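To make the intended semantics concrete, here is a minimal loop-based sketch that reproduces both formats. It assumes both indices in idx are inclusive, which is what the expected output above implies; the names left and in_place are only illustrative.

```python
import numpy as np

# Values copied from the question so the sketch is self-contained.
a = np.array([[684, 559, 629, 192, 835],
              [763, 707, 359,   9, 723],
              [277, 754, 804, 599,  70],
              [472, 600, 396, 314, 705]])
idx = np.array([[2, 4], [0, 3], [2, 3], [1, 3]])

left = np.zeros_like(a)      # first format: kept values shifted to the left
in_place = np.zeros_like(a)  # second format: kept values stay in their columns

for i, (start, end) in enumerate(idx):
    seg = a[i, start:end + 1]          # end index is treated as inclusive
    left[i, :seg.size] = seg
    in_place[i, start:end + 1] = seg

print(left)
print(in_place)
```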
Another possible solution, which uses:
np.arange to create a range of column indices based on the number of columns in a.
A boolean mask m is created using logical operations to check if each column index falls within the range specified by idx. The np.newaxis is used to align dimensions for broadcasting.
np.where is used to create a_mask, where elements in a are replaced with 0 if the corresponding value in m is False.
np.argsort is used to get the indices that would sort each row of m (negated) in ascending order.
np.take_along_axis is used to rearrange the elements of a_mask based on the sorted indices.
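The broadcasting step can be checked in isolation. The sketch below uses the idx values from the question; start and end are illustrative names, not part of the solution itself.

```python
import numpy as np

idx = np.array([[2, 4], [0, 3], [2, 3], [1, 3]])  # values from the question
cols = np.arange(5)                                # shape (5,)

start = idx[:, 0, np.newaxis]                      # shape (4, 1)
end = idx[:, 1, np.newaxis]                        # shape (4, 1)

# (5,) compared against (4, 1) broadcasts to (4, 5):
# one row of flags per row of idx, one flag per column of a.
m = (cols >= start) & (cols <= end)

print(start.shape, cols.shape, m.shape)
print(m.astype(int))
```

With the mask understood, the complete solution is: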
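cols = np.arange(a.shape[1])
# True where the column index lies in [start, end] (both inclusive)
m = (cols >= idx[:, 0, np.newaxis]) & (cols <= idx[:, 1, np.newaxis])
a_mask = np.where(m, a, 0)
# kind="stable": the default sort kind is not guaranteed to keep the selected values in order
sort_idx = np.argsort(~m, axis=1, kind="stable")
np.take_along_axis(a_mask, sort_idx, axis=1)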
NB: Notice that a_mask contains the unsorted version of the solution, i.e., the second of the two expected formats (that is essentially the approach followed by @mozway).
Output:
array([[629, 192, 835,   0,   0],
       [763, 707, 359,   9,   0],
       [804, 599,   0,   0,   0],
       [600, 396, 314,   0,   0]])
# a_mask
array([[  0,   0, 629, 192, 835],
       [763, 707, 359,   9,   0],
       [  0,   0, 804, 599,   0],
       [  0, 600, 396, 314,   0]])
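Wrapped as a reusable function, the steps might look like the sketch below. slice_rows is a hypothetical name, and kind="stable" is an addition here: NumPy's default sort kind is not guaranteed to be stable, while the left-aligned result relies on the kept values staying in their original order.

```python
import numpy as np

def slice_rows(a, idx):
    """Hypothetical helper: keep a[i, idx[i, 0]:idx[i, 1] + 1] per row, zero elsewhere.

    Returns (left_aligned, in_place), matching the two formats in the question.
    """
    cols = np.arange(a.shape[1])
    # idx[:, [0]] has shape (n, 1), so it broadcasts against cols of shape (k,)
    m = (cols >= idx[:, [0]]) & (cols <= idx[:, [1]])
    a_mask = np.where(m, a, 0)
    # A stable sort of ~m moves the True positions to the front without reordering them.
    order = np.argsort(~m, axis=1, kind="stable")
    return np.take_along_axis(a_mask, order, axis=1), a_mask

# Values copied from the question.
a = np.array([[684, 559, 629, 192, 835],
              [763, 707, 359,   9, 723],
              [277, 754, 804, 599,  70],
              [472, 600, 396, 314, 705]])
idx = np.array([[2, 4], [0, 3], [2, 3], [1, 3]])

left, in_place = slice_rows(a, idx)
print(left)
print(in_place)
```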