I want to rank the rows of a 2D matrix lexicographically (in descending order) such that they are ranked: first by the left-most column, then second left-most column, and so on, with equal rankings being shared by rows which have the same value across all columns (making it different to a sorting problem).
I can achieve this using a two-stage approach of sorting the rows lexicographically first, then iterating over the array of sorted indices to assign ranks accordingly:
import numpy as np
A = np.array([[100, 50, 3], #0
[200, 40, 2], #1
[100, 30, 7], #2
[300, 40, 4], #3
[200, 40, 2], #4
[200, 20, 3], #5
[100, 10, 6], #6
[100, 30, 7]]) #7
sorted_rows = np.lexsort(([-A[:,i] for i in range(A.shape[1]-1,-1,-1)]))
rank = 0
ranks = np.array([0 for _ in range(A.shape[0])])
for i in range(1,len(sorted_rows)):
if (A[sorted_rows[i],:] != A[sorted_rows[i-1],:]).any():
rank = rank + 1
ranks[sorted_rows[i]] = rank
print(ranks) gives the output:
[3 1 4 0 1 2 5 4]
which is correct, but I was wondering if there was a better way of doing this in a single step?
Here's one way you can do it, with just a couple steps and no explicit Python loops, but it relies on using a function from SciPy. The idea is to convert each row into a single number, and rank the numbers using
scipy.stats.rankdata. The conversion of each row to a single number is accomplished using the idea of a mixed radix number system.Here's a demonstration in an ipython session. First, your data:
Get the "bases" of mixed radix coefficients for each column; each value is the maximum of the column to the right plus one, and the last value is set to 1.
Form the coefficients that will be used to convert the rows of
Ato the scalar values. This is the cumulative product from right to left ofb:Convert each row to its value using the mixed radix representation, and multiply by -1 because you want the results in descending order:
Use
scipy.stats.rankdatawithmethod='dense'to compute the ranks (one is subtracted from the result becauserankdatastarts its ranks at 1):The short version, with just the code: