c++algorithmtime-complexitybinary-tree

Optimally counting number of nodes in a complete binary tree


I am looking at LeetCode problem 222. Count Complete Tree Nodes:

Given the root of a complete binary tree, return the number of the nodes in the tree.

According to Wikipedia, every level, except possibly the last, is completely filled in a complete binary tree, and all nodes in the last level are as far left as possible. It can have between 1 and 2h nodes inclusive at the last level h.

Design an algorithm that runs in less than O(n) time complexity.

I am targeting time complexity O(log(n)), however in the worst case, which as per my understanding would be when the last level has a single node, the complexity would be O(n*log(n)). But this would not meet the required "less than O(n)". What am I missing here?

Analysis

The brute-force approach to counting the number of nodes in general, which is traversing through the entire tree and keeping count of the number of nodes encountered, is O(n), where n is the number of nodes.

To improve on it, we could make use of the information that the tree is a complete binary tree and has O(log(n)) levels. So that we can compute left height and right height. In case these come out to be equal for a subtree root, then that subtree is a full binary tree and the number of nodes have a closed form solution as 2^h-1, where h is the height of subtree.

In code, this would look something like:

int findLeftHeight(BinaryTreeNode<int>* root){
    int leftHeight = 0;
    while(root){
        leftHeight += 1;
        root = root->left;
    }
    return leftHeight;
}

int findRightHeight(BinaryTreeNode<int>* root){
    int rightHeight = 0;
    while(root){
        rightHeight += 1;
        root = root->right;
    }
    return rightHeight;
}

int countCompleteNodes(BinaryTreeNode<int>* root){
    if(root == NULL){
        return 0;
    }
    int lh = findLeftHeight(root);
    int rh = findRightHeight(root);

    if(lh == rh) return (1<<lh)-1;

    return 1 + countCompleteNodes(root->left) + countCompleteNodes(root->right);
}

But for the worst case, when the complete binary tree has just one node in the last level, with the number of levels being maximum permissible for the upper bound on n. Then it seems that the time it takes to count is O(n*log(n)), because at each recursive step we are making a call on both root->left and root->right.

What am I missing here?


Solution

  • Then it seems that the time it takes to count is O(n*log(n)), because at each recursive step we are making a call on both root->left and root->right.

    But realise that at least one of these two recursive calls will have lh == rh. It cannot happen that this is false for both subtrees, as that would imply the binary tree is not a complete one. For instance, it would be false for both subtrees 𝐴 and 𝐵 here:

                  _ * _
                 /     \
                A       B
               / \     / \
              *   *   *   *
             / \     /
            *   *   *
    

    ...which is not a complete binary tree.

    Concluding, although two recursive calls are made, we can be sure that at least one of them will not have any deeper recursive calls, so it will execute with O(1) time complexity. The remaining one might have deeper recursive calls, but always along one path, narrowing down to the spot where the bottom layer has its rightmost leaf.