Suppose I want to indepedently re-order each row of a matrix. Here is an example of that using np.argsort()
:
>>> A
array([[88, 44, 77, 33, 77],
[33, 55, 66, 88, 0],
[88, 0, 0, 55, 88],
[ 0, 22, 44, 88, 33],
[33, 33, 77, 66, 66]])
>>> ind = np.argsort(A); ind
array([[3, 1, 2, 4, 0],
[4, 0, 1, 2, 3],
[1, 2, 3, 0, 4],
[0, 1, 4, 2, 3],
[0, 1, 3, 4, 2]])
>>> np.array([A[i][ind[i]] for i in range(ind.shape[0])])
array([[33, 44, 77, 77, 88],
[ 0, 33, 55, 66, 88],
[ 0, 0, 55, 88, 88],
[ 0, 22, 33, 44, 88],
[33, 33, 66, 66, 77]])
The last expression above (the one that uses range()
) is one solution to my problem. My question is: Is there a better way to do this?
The inputs are two matrices like A
and ind
, both 2-dimensional and of the same size. The matrix A
can have any values, and the values of ind
are interepreted to be column indexes within A
. The output is a new matrix the same size as ind
whose values are from A
as per the above expression: np.array([A[i][ind[i]] for i in range(ind.shape[0])])
.
Each row of ind
corresponds to the same row in A
. Entry B[i,j]
of the output comes from entry A[i, ind[i, j]]
of the input.
Note that ind
may have fewer columns than A
, and I would like to support that case.
I'm asking because my solution (the given expression) is essentially using a for loop, and maybe numpy can do this more quickly using some internal loop. For my application, speed is important, so it would be nice if I can be time-efficient.
np.take_along_axiswas added (recently) to facilitate this kind of indexing. I believe its docs mention its use with
argsort`:
In [113]: A=np.array([[88, 44, 77, 33, 77],
...: [33, 55, 66, 88, 0],
...: [88, 0, 0, 55, 88],
...: [ 0, 22, 44, 88, 33],
...: [33, 33, 77, 66, 66]])
...:
...: ind=np.array([[3, 1, 2, 4, 0],
...: [4, 0, 1, 2, 3],
...: [1, 2, 3, 0, 4],
...: [0, 1, 4, 2, 3],
...: [0, 1, 3, 4, 2]])
In [114]: np.take_along_axis(A, np.argsort(A),1)
Out[114]:
array([[33, 44, 77, 77, 88],
[ 0, 33, 55, 66, 88],
[ 0, 0, 55, 88, 88],
[ 0, 22, 33, 44, 88],
[33, 33, 66, 66, 77]])
I can do the same with advanced indexing, replacing the range
iteration with np.arange
.
In [119]: A[np.arange(5),ind.T].T
Out[119]:
array([[33, 44, 77, 77, 88],
[ 0, 33, 55, 66, 88],
[ 0, 0, 55, 88, 88],
[ 0, 22, 33, 44, 88],
[33, 33, 66, 66, 77]]
I had to add a couple of transpose the get the desired output and layout. I haven't taken time to figure out just why those are needed.
Applying the argsort to axis 0 can be done with these two:
In [123]: A[ind,np.arange(5)]
Out[123]:
array([[ 0, 55, 0, 66, 77],
[33, 44, 66, 55, 33],
[33, 0, 44, 33, 66],
[88, 55, 77, 55, 33],
[88, 55, 44, 66, 88]])
In [124]: np.take_along_axis(A, np.argsort(A),0)
Out[124]:
array([[ 0, 55, 0, 66, 77],
[33, 44, 66, 55, 33],
[33, 0, 44, 33, 66],
[88, 55, 77, 55, 33],
[88, 55, 44, 66, 88]])