I was recently performing a complexity analysis on two separate methods for generating the subsets of a given set. For this question, I am removing all unessential components of the code, meaning that both of these functions are no more than "dummy" examples with no real purpose. Assume that lst
represents a list of size n
and curr
represents the current subset and that each method will be initially called with i
set to 0
.
Method 1:
def meth_1(i):
if i == len(lst):
return
curr.append(lst[i])
meth_1(i + 1)
curr.pop()
meth_1(i + 1)
Method 2:
def meth_2(i):
if i == len(lst):
return
for j in range(i, len(lst)):
curr.append(lst[j])
meth_2(j + 1)
curr.pop()
While both of these implementations appear to generate the same result, I am having a hard time rigorously analyzing their worst-case time complexities. As a general approach to analyzing the time complexity of recursion, I sum the total number of work across all of the levels of the recursion tree; I am aware that in more complicated examples, each recursive call may complete different amounts of work, though there are easy ways to get around this.
Applying this approach to meth_1
is routine. Drawing a tree diagram is perhaps the most intuitive way of going about it:
The diagram above has three columns for each level of the recursion tree. Nodes represent the total number of nodes on the level, level represents the tree level (zero-indexed), and total work represents the total work completed on that level. Here, we assume that the non-recursive work of a given call is a constant c
. We know that our recursive tree will have n + 1
total levels since each level will increment the i
by 1
until value of n
is reached. It is also worth noting that the total work completed on each level can be modeled by $\sum_{i=0}^{n+1} 2^ic = \frac{c(1-2^{n+1})}{-1} = O(2^n)$.
However, I am having a hard time attempting to analyze the time complexity of meth_2
. In this case, it is much harder to draw out the recursion diagram for several reasons. First, each function call will execute a variable number of recursion calls, unlike meth_1
, where each call makes two subsequent recursive calls. Likewise, each of meth_2
's recursive calls will be on a different "level"; this contrasts with meth_1
, where each subsequent recursive call for a call on level i
was on level i + 1
. I write all of this to ask: how do I go about analyzing the time complexity of meth_2
?
We can determine the time complexity of meth_2
using inductive reasoning.
Let's assume that the length of lst
is n
.
We will denote the time complexity of meth_2(i)
as f_i
.
Since meth_2(n)
stop at the first condition, the number of operations it performs is 1
.
We will assume that f_i = 2^(n-i)
for each k < i <= n
. You can see that f_k=1+f_(k+1)+f_(k+2)...+f_k(n) = 1+2^(n-k-1)+2^(n-k-2)+2^(n-n) = 1 + 2^0 + ... + 2^(n-k-1) = 1 + 2^(n-k) - 1 = 2^(n-k)
.
Because f_n = 1 = 2^(n-n)
if we will set k=n-1
this relationship holds for each k < i <= n
(it only i=n
), so it also true for i=n-1
and we can apply this until i=0
. Since we run the meth_2
with i = 0
the time complexity of the function is f_0 = 2^(n-0) = O(2^n)
.
Inductive reasoning is one of the most important tools to analyze the complexity of recursive functions. For many functions, you can predict the order of magnitude of their complexity by counting the number of function calls for small values of n
and then validate that prediction by applying inductive reasoning.