Hi, I have a problem with a NumPy array of pairs. For example, given:
a = np.array([[1,2],[2,4],[5,6],[3,4],[7,8],[3,5]])
I need the output to be a list of arrays:
[np.array([1,2,2,4,4,3,3,5,5,6]),
np.array([7,8])]
The elements within each subarray, and the subarrays themselves, don't have to be in any particular order.
A pair should be flipped if that lets it join a subarray in the output. The two elements of each pair are guaranteed to be different.
My crude attempt is below, but it doesn't give the correct output.
import numpy as np

def concatenate_neighbouring_pairs(input):
    output = []
    for i in range(len(input)):
        subarray = input[i]
        for j in range(1,len(input)):
            #print(i,j)
            intersect1 = np.in1d(subarray, input[j])
            intersect2 = np.in1d(input[j] ,subarray)
            #print(intersect1, intersect2)
            if intersect1[0] == True and intersect1[-1] == False and intersect2[0] == True and intersect2[-1] == False:
                subarray = np.concatenate((np.flip(input[j]),subarray))
            elif intersect1[0] == True and intersect1[-1] == False and intersect2[0] == False and intersect2[-1] == True:
                subarray = np.concatenate((input[j], subarray))
            elif intersect1[0] == False and intersect1[-1] == True and intersect2[0] == True and intersect2[-1] == False:
                subarray = np.concatenate((subarray, input[j]))
            elif intersect1[0] == False and intersect1[-1] == True and intersect2[0] == False and intersect2[-1] == True:
                subarray = np.concatenate((subarray, np.flip(input[j])))
        output.append(subarray)
    return output
Calling the function outputs:
[array([1, 2, 2, 4, 4, 3, 3, 5]),
array([2, 4, 4, 3, 3, 5]),
array([3, 5, 5, 6]),
array([5, 3, 3, 4, 4, 2]),
array([7, 8]),
array([4, 3, 3, 5, 5, 6])]
Yes, it is better to treat this as a graph problem (as mentioned in the comments) and use a library such as networkx.
First find the connected components. Then, starting from one node of each component (any node works), run a DFS and collect the edges it traverses.
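To make those two steps concrete, here is a minimal sketch in plain Python that does not need networkx. The function name `chain_pairs` and the adjacency-list layout are my own choices for illustration, not part of any library. It treats each pair as an undirected edge and, for every not-yet-visited node, runs an iterative DFS that records each edge the first time it is followed.

```python
from collections import defaultdict

def chain_pairs(pairs):
    """Group pairs into connected components; list each component's edges in DFS order."""
    # Build an undirected adjacency list from the pairs.
    adjacency = defaultdict(list)
    for u, v in pairs:
        adjacency[u].append(v)
        adjacency[v].append(u)

    visited = set()
    result = []
    for start in adjacency:
        if start in visited:
            continue  # this node already belongs to an earlier component
        visited.add(start)
        flat = []
        # Each stack entry holds a node and an iterator over its neighbours.
        stack = [(start, iter(adjacency[start]))]
        while stack:
            node, neighbours = stack[-1]
            for nxt in neighbours:
                if nxt not in visited:
                    visited.add(nxt)
                    flat.extend((node, nxt))  # record the edge as a flipped-if-needed pair
                    stack.append((nxt, iter(adjacency[nxt])))
                    break
            else:
                stack.pop()  # no unvisited neighbours left, backtrack
        result.append(flat)
    return result

pairs = [[1, 2], [2, 4], [5, 6], [3, 4], [7, 8], [3, 5]]
print(chain_pairs(pairs))
# [[1, 2, 2, 4, 4, 3, 3, 5, 5, 6], [7, 8]]
```

The result is a list of plain lists; wrap each one in `np.array(...)` if arrays are needed.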
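With networkx this can be written compactly:
import networkx as nx
import numpy as np

a = np.array([[1,2],[2,4],[5,6],[3,4],[7,8],[3,5]])
G = nx.Graph()
G.add_edges_from(a.tolist())
# For each connected component, flatten the DFS edge list started from one of its nodes.
[np.array(list(nx.dfs_edges(G, source=list(s)[0]))).flatten()
 for s in nx.connected_components(G)]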
output:
[array([1, 2, 2, 4, 4, 3, 3, 5, 5, 6]), array([8, 7])]
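One caveat: the start node is taken from a set, so it is arbitrary. If the DFS happens to start in the middle of a path, the edges come out as two branches rather than one continuous run. If a continuous run matters, a possible variation is to start from a node of degree 1 when the component has one. The sketch below assumes each component is a simple path or a cycle; `chains_from_endpoints` is a hypothetical helper name, and the small input array is made up for the example.

```python
import networkx as nx
import numpy as np

def chains_from_endpoints(pairs):
    """Flatten each component's DFS edges, starting at a degree-1 node when one exists."""
    G = nx.Graph()
    G.add_edges_from(np.asarray(pairs).tolist())
    chains = []
    for component in nx.connected_components(G):
        # Prefer an endpoint (degree 1); fall back to the smallest node, e.g. for a cycle.
        endpoints = sorted(n for n in component if G.degree(n) == 1)
        source = endpoints[0] if endpoints else min(component)
        chains.append(np.array(list(nx.dfs_edges(G, source=source))).flatten())
    return chains

a = np.array([[2, 4], [3, 4], [1, 2], [7, 8]])
for chain in chains_from_endpoints(a):
    print(chain)
# [1 2 2 4 4 3]
# [7 8]
```

Sorting the endpoints only makes the choice deterministic; either end of a path gives a valid chain.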