`I have a list of arrays like this (in x, y coordinates):
coordinates= array([[ 300, 2300],
[ 670, 2360],
[ 400, 2300]]), array([[1500, 1960],
[1620, 2200],
[1505, 1975]]), array([[ 980, 1965],
[1060, 2240],
[1100, 2250],
[ 980, 1975]]), array([[ 565, 1940],
[ 680, 2180],
[ 570, 1945]])]
I want the arrays to be sorted by the nearest neighbor. So, the last coordinate of the first array should be close to the first coordinate of the next array and so forth.
I extracted all of the first and last coordinates and put them in two lists using python numpy. Then tried to use sklearn.neighbors NearestNeighbors, but it didn't work.`
expected output:
coordinates= array([[ 300, 2300],
[ 670, 2360],
[ 400, 2300]]), array([[ 565, 1940],
[ 680, 2180],
[ 570, 1945]]), array([[ 980, 1965],
[1060, 2240],
[1100, 2250],
[ 980, 1975]]), array([[1500, 1960],
[1620, 2200],
[1505, 1975]])]
This looks like a traveling_salesman_problem
. Compute the pairwise distance between all nodes, make the self distance NaN/Inf, then get the shortest path:
from numpy import array
import networkx as nx
coordinates= [array([[ 300, 2300],
[ 670, 2360],
[ 400, 2300]]),
array([[1500, 1960],
[1620, 2200],
[1505, 1975]]),
array([[ 980, 1965],
[1060, 2240],
[1100, 2250],
[ 980, 1975]]),
array([[ 565, 1940],
[ 680, 2180],
[ 570, 1945]])]
# extract starts and ends
starts = np.vstack([a[0] for a in coordinates])
ends = np.vstack([a[-1] for a in coordinates])
# compute pairwise distances, except self
dist = np.sqrt(((starts[:, None] - ends)**2).sum(-1))
np.fill_diagonal(dist, np.nan)
# build the graph
G = nx.from_numpy_array(np.round(dist, 1), create_using=nx.DiGraph)
G.remove_edges_from(nx.selfloop_edges(G))
# find shortest path (NB. cycle could be False)
path = nx.approximation.traveling_salesman_problem(G, cycle=True)
# [1, 0, 3, 2, 1]
out = [coordinates[i] for i in path[1:]]
If you don't necessarily want a cycle, use cycle=False
and out = [coordinates[i] for i in path]
.
Output:
[array([[ 300, 2300],
[ 670, 2360],
[ 400, 2300]]),
array([[ 565, 1940],
[ 680, 2180],
[ 570, 1945]]),
array([[ 980, 1965],
[1060, 2240],
[1100, 2250],
[ 980, 1975]]),
array([[1500, 1960],
[1620, 2200],
[1505, 1975]])]
Intermediate dist
:
array([[ nan, 1248.05849222, 753.67433285, 446.01008957],
[1151.34703717, nan, 520.21630117, 930.12095988],
[ 669.79474468, 525.09522946, nan, 410.48751504],
[ 396.01136347, 940.65137006, 416.47328846, nan]])
Graph (with distances as edge labels and the shortest path in red):