data-structurestreediscrete-mathematics

Time Complexity of traversing n-ary tree seems to have multiple correct answers, which is most correct?


enter image description here

Ignoring space complexity, assuming each node in the tree is touched exactly once and considering DFS and BFS traversal time equivalent, what is the time complexity of traversing an n-ary tree?

Given that Big O notation is an asymptotic measure, meaning that we are looking a function that gives us a line or curve that best fits the problem as it is extended to more levels or branches.

My intuition tells me that for tree structures in general we would want a function of the sort b^l where b is the branching factor and l is the number of levels in the tree (for a full and complete tree).

However for a partial tree, it would make sense to take some sort of average of b and l, perhaps AVG(b)^AVG(l).

In looking for this answer I find many people are saying it is O(n) where n is the number of verticies in the tree (nodes -1). See: What is the time complexity of tree traversal? and Complexity of BFS in n-ary tree

But a linear solution in my mind does not model the cost (in time) that the algorithm will take as the tree adds additional levels (on average). Which is what I understand Big O notation is intended to do.


Solution

  • The height or branching factor of a tree are not the determining factors for the complexity of a complete traversal (whether it be BFS or DFS). It is only the number of vertices that is the determining factor.

    If you want to express the complexity in terms of branching factor and height of the tree, then you are really asking what the relation is between these measures and the number of nodes in a tree.

    As you already indicate, if the branching factor is 100 but it is only rarely that a node has more than 3 children, then the branching factor is not really telling us much. The same can be said about the height of the tree. A tree of height 4 and branching factor 2 can have between 5 and 31 nodes.

    What to do then? Often, the worst case will be taken (maximising the number of nodes). This means that you'll translate branching factor and height to a tree that is perfect, i.e. where each node has the maximum number of children, except for the leaves of the tree, which are all on the same level.

    The number of nodes 𝑛 is then 𝑏ℎ+1−1, where 𝑏 is the branching factor, and ℎ the height (number of edges on the longest path from root to leaf).

    That means the worst case time complexity for a given 𝑏 and ℎ is O(𝑏ℎ+1)

    Working with average is not really practical. The distribution of the branching factor may not be linear, and the distribution of leaf-depths might not be linear either, so working with averages is not going to give much insight.

    As to the cost of adding a node: Once the insertion point is determined, the time complexity for adding a node is constant. It does not matter whether that insertion increases the branching factor or increases the height of the tree.

    There is some variation when it comes to finding a node if the tree is a search tree (like a BST or B-tree). In that case the height of the tree becomes important, and a search would cost O(ℎ). If however the tree is self-balancing (like AVL or B-tree), this variation is limited and this complexity would be O(log𝑛)