pythonnetwork-programminggraphgraph-theory

Get the most optimal combination of places from a list of places through graph algorithm


We are having a networking situation where he have four nodes - say A, B, C and D. The problem - A, B, C and D are not fixed, but have a bunch of possibilities.

Say A can have values A1, A2 upto An. Same for B - B1, B2, Bn.

The same for C and D as well. Each of An, Bn, Cn and Dn has coordinates as properties (a tuple - (21.2134, -32.1189)).

What we are trying to achieve - find the best combination of An, Bn, Cn and Dn which are closest to each other if traversed in that order (A -> B -> C -> D). Example, it can be A11 -> B1 -> C9 -> D4 - basically choose one option from each of A, B, C and D from all their respective possible values so that the combination of these values are closest to each other among all such combinations.

We are trying brute force approach of course, but it is very expensive, since its time complexity is N(A) * N(A) * N(C) * N(D).

So we are looking for some graph based solution where we can harness a graph by storing all points with coordinates in it and at runtime we can simply pass in all sets of points (all n points for A, all k points for B, all j points for C and all l points for D) to it and it will output the right point for each of those classes (A, B, C and D) for which the route is most optimal.

If the graph needs to be stored in memory, that's fine as well, the quicker the runtime response time the better, memory is not a constraint, and also, whether a route exists between An and Bn is irrelevant here, all we want to do is find out coordinate based proximity (assuming a linear route between each set of points).

Also, it is not mandatory, in fact, it will be almost the case always, that the runtime query includes only few of the categories of places, like the exhaustive list has A, B, C and D, but a runtime query may try to find combination of just B and C, or A and D, or A, B and D, etc.


Solution

  • One option would be to build a KDTree for each category and query the alternating nearest place :

    from scipy.spatial import KDTree
    
    kdtrees = {cat: KDTree(list(coords.values())) for cat, coords in places.items()}
    
    
    def shortest_comb(places, categories, kdtrees):
        min_dis = float("inf")
    
        for init_name, init_coo in places[categories[0]].items():
            route, total_dis = [init_name], 0
            curr = init_coo
    
            for next_cat in categories[1:]: # from B and further..
                distance, index = kdtrees[next_cat].query(curr)
                following = list(places[next_cat].keys())[index]
                curr = places[next_cat][following]
                route.append(following)
                total_dis += distance
    
            if total_dis < min_dis:
                min_dis = total_dis
                mpr = route
    
        return mpr, min_dis
    

    mpr, min_dis = shortest_comb(places, categories, kdtrees)
    
    print(f"most plausible route: {'->'.join(mpr)}, distance={min_dis/1000:.2f}km")
    # most plausible route: A6->B3->C3->D4, distance=43.50km
    

    enter image description here

    Used inputs:

    import random
    
    random.seed(1)
    
    
    def rnd_points(n, cat, bbox=(21, -32, 22, -31), as_latlon=False):
        min_lon, min_lat, max_lon, max_lat = bbox
        points = [
            (
                random.uniform(min_lat, max_lat),
                random.uniform(min_lon, max_lon)
            ) for _ in range(n)
        ]
        if not as_latlon:
            from pyproj import Transformer
            transformer = Transformer.from_crs(4326, 32734)
            points = [pt for pt in transformer.itransform(points)]
        return {f"{cat}{i}": coo for i, coo in enumerate(points, 1)}
    
    
    categories = "ABCD"
    places = {cat: rnd_points(n, cat) for cat, n in zip(categories, (9, 7, 5, 6))}