Suppose I want to indepedently re-order each row of a matrix. Here is an example of that using np.argsort():
>>> A
array([[88, 44, 77, 33, 77],
[33, 55, 66, 88, 0],
[88, 0, 0, 55, 88],
[ 0, 22, 44, 88, 33],
[33, 33, 77, 66, 66]])
>>> ind = np.argsort(A); ind
array([[3, 1, 2, 4, 0],
[4, 0, 1, 2, 3],
[1, 2, 3, 0, 4],
[0, 1, 4, 2, 3],
[0, 1, 3, 4, 2]])
>>> np.array([A[i][ind[i]] for i in range(ind.shape[0])])
array([[33, 44, 77, 77, 88],
[ 0, 33, 55, 66, 88],
[ 0, 0, 55, 88, 88],
[ 0, 22, 33, 44, 88],
[33, 33, 66, 66, 77]])
The last expression above (the one that uses range()) is one solution to my problem. My question is: Is there a better way to do this?
The inputs are two matrices like A and ind, both 2-dimensional and of the same size. The matrix A can have any values, and the values of ind are interepreted to be column indexes within A. The output is a new matrix the same size as ind whose values are from A as per the above expression: np.array([A[i][ind[i]] for i in range(ind.shape[0])]).
Each row of ind corresponds to the same row in A. Entry B[i,j] of the output comes from entry A[i, ind[i, j]] of the input.
Note that ind may have fewer columns than A, and I would like to support that case.
I'm asking because my solution (the given expression) is essentially using a for loop, and maybe numpy can do this more quickly using some internal loop. For my application, speed is important, so it would be nice if I can be time-efficient.
np.take_along_axis
was added (recently) to facilitate this kind of indexing. I believe its docs mention its use withargsort`:I can do the same with advanced indexing, replacing the
rangeiteration withnp.arange.I had to add a couple of transpose the get the desired output and layout. I haven't taken time to figure out just why those are needed.
Applying the argsort to axis 0 can be done with these two: