Find the best couples of values from a scatter plot

I have a list of thousands of couples of float values (reward, risk).

I want to extract the top couples, i.e. the couples for which no other couple offers a higher reward at an equal or lower risk.

Note to financial experts: it is a bit similar to an efficient frontier, but there is neither mean nor standard deviation. A sample of my data points with a representation of the cloud:

import numpy as np
import matplotlib.pyplot as plt
# first value is reward, second is risk
cloud = np.array([[1,2],[4,3],[5.5,2.3],[4,2],[3,3],[.9,1.9],[4,3],[4,3.2],[3,2.2],[2,2.6]])
plt.scatter(cloud[:,1], cloud[:, 0])
plt.xlabel("risk")
plt.ylabel("reward")

I expect an array with [.9, 1.9], [4, 2] and [5.5, 2.3].

I can do it with a loop, but that is not elegant and may not be efficient.

1 answer

Answer by tibibou (best answer)

Here is a first attempt; I am not sure it is the best approach.

I am posting it in case it helps someone, or in case it can be improved for large clouds of points:

import numpy as np
import matplotlib.pyplot as plt
# first value is reward, second is risk
cloud = np.array([[1,2],[4,3],[5.5,2.3],[4,2],[3,3],[.9,1.9],[.9,1.9], [4,3],[4,3.2],[3,2.2],[5.5,2.3],[2,2.6]])

def extract_border(cloud):
    """Extract the couples (reward, risk) that no other couple beats on reward without taking more risk."""
    # if some couples are identical, only one is recorded
    # function takes the cloud of points and returns the border array
    # initial cloud is left unchanged: only the local name is rebound inside the function
    border = []
    while cloud.shape[0] > 0:  # some points are still remaining
        # highest reward; if several couples share it, take the one with the lowest risk
        candidates = cloud[cloud[:, 0] == cloud[:, 0].max()]
        best = candidates[np.argmin(candidates[:, 1])]
        border.append(best)  # record the current best couple
        cloud = cloud[cloud[:, 1] < best[1]]  # drop every couple whose risk is not strictly lower
    # always return a 2-D array, even when the cloud is empty
    return np.array(border) if border else np.empty((0, 2))
border = extract_border(cloud)
print(f"final border: \n reward   risk \n {border}")
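
Since the question asks for something that avoids an explicit loop, here is a possible vectorized sketch, not a definitive implementation. It sorts the couples by ascending risk and keeps only those whose reward is strictly higher than every reward seen at a lower or equal risk. The name `pareto_border` is hypothetical, and the sketch assumes that a higher reward is always better, a lower risk is always better, and that exact ties count as beaten.

```python
import numpy as np

def pareto_border(cloud):
    """Hypothetical helper: keep the (reward, risk) couples that no other couple beats."""
    cloud = np.asarray(cloud, dtype=float)
    if cloud.size == 0:  # nothing to compare
        return np.empty((0, 2))
    # sort by ascending risk; at equal risk, put the highest reward first
    order = np.lexsort((-cloud[:, 0], cloud[:, 1]))
    ordered = cloud[order]
    # best reward among the couples that come earlier in that order
    running_max = np.maximum.accumulate(ordered[:, 0])
    best_before = np.concatenate(([-np.inf], running_max[:-1]))
    # a couple is kept only if it strictly improves on every earlier reward
    return ordered[ordered[:, 0] > best_before]

# sample cloud from the question: first value is reward, second is risk
cloud = np.array([[1, 2], [4, 3], [5.5, 2.3], [4, 2], [3, 3], [.9, 1.9],
                  [4, 3], [4, 3.2], [3, 2.2], [2, 2.6]])
print(pareto_border(cloud))
```

On the sample cloud from the question this returns the three expected couples, ordered by increasing risk. The cost is dominated by the sort, so it should scale to thousands of points without trouble.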