algorithmtreebinary-treearithmetic-expressionstree-balancing

Balancing arithmetic expression tree with +, - operators


Given a binary arithmetic expression tree consisting of only addition and subtraction operators, and numbers, how to balance the tree as much as possible? The task is to balance the tree without evaluating the expression, that is the number of nodes should stay the same.

Example:

      +                           +
    /   \                      /     \
   +     15      >>>>>>       -       +
 /   \                       / \     / \
5     -                     6   4   5   15
    /   \
   6     4

Addition is commutative and associative and that allows for balancing. Commutativity allows for swapping of children of consecutive '+' nodes. Associativity allows for rotations. In the above example, the transformation performed can be viewed as

  1. Rotation right on '+' at the root.
  2. Swapping of '5' and '-' nodes.

I was thinking of doing an in order traversal and first balancing any sub-trees. I would try to balance any sub-tree with two consecutive '+' nodes by trying all possible arrangements of nodes (there are only 12 of them) to hopefully decrease the total height of the tree. This method should reduce the height of the tree by at most 1 at any step. However, I cannot determine whether it will always give a tree of minimum height, especially when there are more than 2 consecutive '+' nodes.

Another approach could be to read the expression tree into an array and substitute any '-' subtree with a variable. And then us DP to determine the best places for brackets. This must be done bottom up, so that any '-' subtree is already balanced when it is considered by DP algorithm. However, I am worried because there could be (n+1)! ways to arrange nodes and brackets. While I am looking for an O(n) algorithm.

Is it a known problem and is there a specific approach to it?


Solution

  • At the risk of doing something vaguely like "evaluating" (although it isn't in my opinion), I'd do the following:

    1. Change the entire tree to addition nodes, by propagating negation markers down to the roots. A simple way to do this would be to add a "colour" to every leaf node. The colour of a node can be computed directly during a tree walk. During the walk, you keep track of the number (or the parity, since that's the only part we're interested in) of right-hand links from a - nodes taken; when a leaf is reached, it is coloured green if the parity is even and red if the parity is odd. (Red leaves are negated.) During the walk, - nodes are changed to +.

            +                          +
          /   \                      /   \
         +    15       >>>>>>       +    15
       /   \                      /   \
      5     -                     5    +
          /   \                      /   \
         6     4                    6    -4
      
    2. Now minimise the depth of the tree by constructing a minimum depth binary tree over top of the leaves, taking the leaves in order without regard to the previous tree structure:

            +                            +
          /   \                      /       \
         +    15       >>>>>>       +         +
       /   \                      /   \     /   \
      5     +                     5    6   -4   15
          /   \
         6    -4
      
    3. Turn the colours back into - nodes. The easy transforms are nodes with no red children (just remove the colour) and nodes with exactly one red child and one green child. These latter nodes are turned into - nodes; if the red child is on the left, then the children are also reversed.

      The tricky case is nodes all of whose children are red. In that case, move up the tree until you find a parent which has some green descendant. The node you find must have two children (since otherwise its only child would have to have a green descendant), of which exactly one child has green descendants. Then, change that node to -, reverse its children if the right-hand child has a green descendant, and recolour green all the children of the (possibly new) right-hand child.

              +                          +
          /      \                   /       \
         +        +    >>>>>>       +         -
       /   \     /   \            /   \     /   \
      5     6   -4   15          5     6   15    4
      

      Perhaps it's worth pointing out that the root node has a green descendant on the left-hand side because the very first leaf node is green. That's sufficient to demonstrate that the above algorithm covers all cases.