pythonpython-3.xbinary-treeexpression-treesn-ary-tree

Accessing x levels of n-ary tree to manipulate a node and add child node with recursion


I have a problem I'm trying to solve where I can take in a variety of expression of different lengths and complexities and insert them as nodes into an n-ary (or non-binary) tree.

I need to pinpoint where to look or what to look for to start that recursion.

The expressions can vary as:

invalid_expr = ["(", "SOME", "&", "THING", ")", "|", "INVALID", "&", "(", "GOES", "|", "IN", "|", "HERE", ")"]
simple_expr = ["VERY", "&", "SIMPLE", "&", "EXPRESSION", "&", "(", "NO", "|", "BIG", "|", "DEAL", ")"]
complicated_expr = ["(", "MORE", "&", "(", "COMPLICATED", "|","(","EXPRESSION","&","PRESENTING",")","|", "MANY", ")", ")", "|", "(", "DEEPER", "&", "(", "LEVELS", "|", "FOR", "|", "TREE", ")",")"]

The tree class is relatively simple:

class NonBinTree:

    def __init__(self, val):
        self.val = val
        self.nodes = []

    def add_node(self, val):
        self.nodes.append(NonBinTree(val))
    
    def add_child_node(self, parent, val):
        self.nodes[parent].add_node(val)

    def make_parent_node(self, parent, val):
        self.nodes[parent] = NonBinTree(val)

    
    def __repr__(self):
        return f"NonBinTree({self.val}): {self.nodes}"

The code functions to first taken an expression, count items within brackets as whole list which goes in a node and items outside of the bracket as a singular string node. The operator that is the common operator will be the parent node, and the non-operator strings or lists will be children nodes. The logic must traverse each layer deeper to access each child, evaluate the string, replace the string parent with the common operator and add string children and list children. The algorithm to do this is:

pre_inserted_nodes, operator = create_nodes(an_expr)
a = NonBinTree(list(operator)[0])
for i in pre_inserted_nodes:
    if i != list(operator)[0]:
        a.add_node(i)
i=0
while i < len(pre_inserted_nodes):
    if isinstance(pre_inserted_nodes[i], list):
        pre_inserted_nodes[i].pop(0)
        pre_inserted_nodes[i].pop()
        new_pre_inserted_nodes, new_operator = create_nodes(pre_inserted_nodes[i])
        temp_new_pre_inserted_nodes = copy.deepcopy(new_pre_inserted_nodes)
        for node_index in range(len(temp_new_pre_inserted_nodes)):
            if not isinstance(temp_new_pre_inserted_nodes[node_index], list):
                if temp_new_pre_inserted_nodes[node_index] == list(new_operator)[0]:
                    new_pre_inserted_nodes.pop(node_index)
        a.make_parent_node(i, list(new_operator)[0])
        for node in new_pre_inserted_nodes:
            a.add_child_node(i, node)
    i += 1

The trees can become very complicated, and in some instances (like in complicated_expr) it can have 5 levels that will need to be accessed. E.g.: NonBinTree(|): [(&): [(More): [], (|): [(Complicated): [], (&): [(Expressions): [], (Presenting): []], (More): []]], (&): [(Deeper): [], (|): [(Levels): [], (For): [], (Tree): []]]]

So it can vary from a.node[0] to a.node[0].node[0].node[1].node[1].add_node(element).

I'm struggling how to handle the expression going from a singular node level to 4 node levels.

I guess the TLDR is, how do I add an extra .node[x] to an existing string of .node[0].node[y].node[z]


Solution

  • You could use a recursive function for this. When you encounter an opening parenthesis, make a recursive call, and when encountering a closing parenthesis (or the end of the input), return out of a call.

    I would simplify your class to just the essential.

    Here is a possible implementation:

    class Node:
        def __init__(self, val, nodes=None):
            self.val = val
            self.nodes = nodes or []
    
        def __repr__(self):
            return f"Node({self.val}): {self.nodes}"
    
    def expr_to_tree(expr):
        it = iter(expr)
    
        def q(s):
            return "end-of-input" if s is None else f"'{s}'"
        
        def get_operand():
            operand = next(it, None)
            if operand in (")", "&", "|", None):
                raise ValueError(f"Expected an operand, but got {q(operand)}")
            return get_expr(")") if operand == "(" else Node(operand)
    
        def get_expr(terminal):
            node = get_operand()
            operator = next(it, None)
            if operator in ("&", "|"):
                node = Node(operator, [node])
                while operator == node.val:
                    node.nodes.append(get_operand())
                    operator = next(it, None)
            if operator != terminal:
                raise ValueError(f"Expected {q(node.val)}, or {q(terminal)}, but got {q(operator)}")
            return node
    
        return get_expr(None)
    

    Example call:

    print(expr_to_tree(complicated_expr))
    

    This would output:

    Node(|): [Node(&): [Node(MORE): [], Node(|): [Node(COMPLICATED): [], Node(&): [Node(EXPRESSION): [], Node(PRESENTING): []], Node(MANY): []]], Node(&): [Node(DEEPER): [], Node(|): [Node(LEVELS): [], Node(FOR): [], Node(TREE): []]]]