I have a list of thousands of (reward, risk) pairs of float values.
I want to extract the top pairs, i.e. the ones that no other pair improves on (higher reward without higher risk, or lower risk without lower reward).
Note to financial experts: this is somewhat similar to an efficient frontier, but there is neither a mean nor a standard deviation involved. Here is a sample of my data points, with a plot of the cloud:
import numpy as np
import matplotlib.pyplot as plt
# first value is reward, second is risk
cloud = np.array([[1,2],[4,3],[5.5,2.3],[4,2],[3,3],[.9,1.9],[4,3],[4,3.2],[3,2.2],[2,2.6]])
plt.scatter(cloud[:,1], cloud[:, 0])
plt.xlabel("risk")
plt.ylabel("reward")
I expect an array containing [.9, 1.9], [4, 2] and [5.5, 2.3].
I can do it with a loop, but that is not elegant and is probably not efficient...
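To make the expected result concrete, here is a minimal sketch of a loop-free pairwise check using NumPy broadcasting. It assumes that a point is "dominated" when some other point has at least the same reward and at most the same risk, and is strictly better on one of the two. The name `pareto_mask` is hypothetical, and the n-by-n comparison costs quadratic memory, so it is only meant for clouds of a few thousand points.

```python
import numpy as np

def pareto_mask(cloud):
    """Boolean mask of rows that no other row dominates.

    Column 0 is reward (higher is better), column 1 is risk (lower is better).
    """
    reward = cloud[:, 0]
    risk = cloud[:, 1]
    # Entry [i, j] is True when point j is at least as good as point i on both axes
    at_least_as_good = (reward[None, :] >= reward[:, None]) & (risk[None, :] <= risk[:, None])
    # Entry [i, j] is True when point j is strictly better than point i on one axis
    strictly_better = (reward[None, :] > reward[:, None]) | (risk[None, :] < risk[:, None])
    dominated = (at_least_as_good & strictly_better).any(axis=1)
    return ~dominated

# first value is reward, second is risk
cloud = np.array([[1,2],[4,3],[5.5,2.3],[4,2],[3,3],[.9,1.9],[4,3],[4,3.2],[3,2.2],[2,2.6]])
# np.unique with axis=0 drops duplicate rows and sorts them by reward
front = np.unique(cloud[pareto_mask(cloud)], axis=0)
print(front)
```

On the sample cloud this keeps the three pairs listed above.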
I wrote a first attempt; I am not sure it is the best approach.
I am posting it in case it helps anybody, or in case it can be improved for large clouds of points:
import numpy as np
import matplotlib.pyplot as plt

# first value is reward, second is risk
cloud = np.array([[1,2],[4,3],[5.5,2.3],[4,2],[3,3],[.9,1.9],[.9,1.9], [4,3],[4,3.2],[3,2.2],[5.5,2.3],[2,2.6]])

def extract_border(cloud):
    """Extract all pairs where the first value is the highest and the second value is the lowest."""
    # if some pairs are identical, only one is recorded
    # the function takes the cloud of points and returns the border array
    # the caller's cloud is left unchanged: only the local name is rebound
    if cloud.shape[0] == 0:  # cloud is empty
        return np.empty((0, 2))
    border = np.zeros(cloud.shape)  # all points may be best pairs
    count = 0  # number of valid pairs recorded so far
    while cloud.shape[0] > 0:  # some points are still remaining
        # highest reward first; on equal reward, prefer the lowest risk
        idx_max = np.lexsort((cloud[:, 1], -cloud[:, 0]))[0]
        best = cloud[idx_max]
        border[count, :] = best  # record the current best pair
        count += 1
        cloud = cloud[cloud[:, 1] < best[1]]  # keep only pairs with strictly lower risk
    return border[:count, :]  # reduce the border size to only valid pairs

border = extract_border(cloud)
print(f"final border: \n reward risk \n {border}")
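For large clouds, a sort-based variant may scale better than removing points pass by pass. The sketch below is a different technique from the function above, not a line-for-line rewrite: it sorts by risk, then keeps a point only if its reward exceeds the running maximum of every point that comes before it. The name `extract_border_sorted` is hypothetical, and the sketch assumes a 2-column array with reward first and risk second.

```python
import numpy as np

def extract_border_sorted(cloud):
    """Sort-based border extraction: one sort plus a running maximum."""
    cloud = np.asarray(cloud, dtype=float)
    if cloud.shape[0] == 0:  # cloud is empty
        return np.empty((0, 2))
    # sort by risk ascending; on equal risk, highest reward first
    order = np.lexsort((-cloud[:, 0], cloud[:, 1]))
    ordered = cloud[order]
    reward = ordered[:, 0]
    # best reward among all points earlier in the ordering (-inf for the first point)
    best_before = np.concatenate(([-np.inf], np.maximum.accumulate(reward)[:-1]))
    # keep a point only if it strictly beats every point with lower or equal risk
    return ordered[reward > best_before]

# first value is reward, second is risk
cloud = np.array([[1,2],[4,3],[5.5,2.3],[4,2],[3,3],[.9,1.9],[.9,1.9], [4,3],[4,3.2],[3,2.2],[5.5,2.3],[2,2.6]])
border_sorted = extract_border_sorted(cloud)
print(border_sorted)
```

The strict comparison also drops exact duplicates, so each border pair appears once. The rows come back ordered by increasing risk rather than by decreasing reward.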