pythonrecursiontree

Find full path of element in tree


I would like to find full path of element in tree. Element can be located in few places.

tree

My current code:

levels = [
    {"L3A": ["L4A"]},
    {"L3B": ["L4B"]},
    {"L3C": ["L4C"]},
    {"L1": ["L2", "L4A"]},
    {"L2": ["L3A", "L3B", "L3C"]}
]

def get_level(name):
    tree = []
    recursive(name, tree)
    return tree

def recursive(name, tree):
    for level in levels:
        for k, v in level.items():
            if name in v:
                tree.append(k)
                recursive(k, tree)

levl = get_level("L4A")
print(levl)

Result

is: ['L3A', 'L2', 'L1', 'L1']

want: [['L3A', 'L2', 'L1'], ['L1']]

Ultimately want:

L4A in L1 > L2 > L3A

L4A in L1 

Could you give me some advise how to change it?


Solution

  • Why does L1 appear twice in your list? Because you have two paths that lead to L4A: L1 -> L2 -> L3A -> L4A and L1 -> L4A, but only one path variable to store those paths. Since you use a kind of reverse DFS, you have the levels: L4 -> L3A -> L2 -> L1, then L4 -> L1.

    Let's try to elaborate an algorithm. You are dealing with a graph (if you add a root, you get a tree), hence I will use the usual vocbulary: levels are "nodes" and paths between levels are "edges". Here's a good way to proceed:

    It lacks a bit of precision to be a real algorithm. Let's focus:

    GIVEN: a node N
    let paths_to_explore = [N]
    while paths_to_explore is not empty:
        for every path_to_explore:
            try to add a node at the beginning of path_to_explore
            if it fails, yield path_to_explore
    

    Before I get to the code, note that you representation of the graph is none of the two usual representations. In your case, you have list of edges, but a dictionary from_node -> [to_nodes] is better:

    edges = {
        "L3A": {"L4A"},
        "L3B": {"L4B"},
        "L3C": {"L4C"},
        "L1": {"L2", "L4A"},
        "L2": {"L3A", "L3B", "L3C"},
    }
    

    This makes the iteration over edges easier:

    for from_node, to_nodes in edges.items():
        # do something with nodes
    

    Now, the code:

    def find_reverse_path(name):
        paths = []
        paths_to_explore = [[name]]
        while paths_to_explore:
            path = paths_to_explore.pop() # next!
            to_node = path[0] # the HEAD of the current path
            expanded = False
            for from_node, to_nodes in edges.items():
                if to_node in to_nodes: # there's an edge to the HEAD
                    new_path_to_explore = [from_node] + path # new path = from_node + old path
                    paths_to_explore.append(new_path_to_explore) # add it to the exploration list
                    expanded = True # this path was expanded
    
            if not expanded: # the path is maximal
                paths.append(path) # use yield if you want to create a generator
    
        return paths
    
    print(find_reverse_path("L4A"))
    

    Output:

    [['L1', 'L4A'], ['L1', 'L2', 'L3A', 'L4A']]
    

    It's an iterative DFS. (It would we harder to know if the path was expanded in a recursive DFS, I think.)

    Look at those two lines, they contain the "trick":

    new_path_to_explore = [from_node] + path # new path = from_node - old path
    paths_to_explore.append(new_path_to_explore) # add it to the exploration list
    

    Note that new_path_to_explore is a copy of path, that means that we don't just add a node to paths[-1] (in place). Why is that? Look at the first steps:

    1. paths = [[L4A]]
    2. paths = [], path = [L4A]
    3. append L1-L4A to paths, then append L3A-L4A to paths
    4. paths = [[L1, L4A], [L3A, L4A]]
    5. paths = [[L1, L4A]], path = [L3A, L4A]
    ...
    

    If we don't make a copy and if we find several edges to the head of the current path, we would have at step 4 paths = [[L3A, L1, L4]]. That would be nearly the same issue that you had in your code.