pythonalgorithmdata-structureslru

Number of hits in Fibonacci using dynamic programming


I am using functools.lru_cache to implement Fibonacci using memoization. But I do not understand the number of hits that my output generates. My code is as follows:

@lru_cache(maxsize=8)
def fib(n):
    if n <2:
        return n
    else:
        return fib(n-1) + fib(n-2)
print(fib(30))
print(fib.cache_info())

The output is CacheInfo(hits=28, misses=31, maxsize=8, currsize=8) I understand that the number of misses is 31 (fib(0),...fib(30)) But the number of hits should be 58. For example, fib(2) should give a hit for fib(0) and fib(1) and a miss for fib(2). So, every call will generate a miss and 2 hits.

I tried searching the internet but did not get much relevant result.


Solution

  • But the number of hits should be 58.

    Don't forget that when there is a hit, the recursion stops.

    When fib(n-1) + fib(n-2) is evaluated, the fib(n-2) call will immediately have a hit because fib(n-1) already ensured that it was cached, and so no further recursions (nor hits) will occur in the fib(n-2) call. The only exception is when the second term is f(0): The first term had not evaluated f(0), because 1 is considered a base case, and so here the right term still represents a miss and no hit.

    But the general principle (ignoring the exception) is true at every level of recursion: the second term of this sum will only make one hit and nothing else. There is no recursion there. The only recursion happens at the evaluation of the first term.

    Visualisation for f(5):

                           miss __f(5)__
                               /        \
                     miss __f(4)__     f(3) hit
                         /        \
               miss __f(3)__     f(2) hit
                   /        \
         miss __f(2)__     f(1) hit
             /        \ 
     miss f(1)       f(0) miss (the exception)
            |          |
           base       base
     
    

    So for f(5) we have 6 (=n+1) misses and 3 (=n-2) hits. Apply this reasoning to f(30) and you get the reported results.

    NB: the cache needs to retain at least the three most recently accessed results. So even if you would use @lru_cache(maxsize=3) you'd get the same hit and miss counts. The cache becomes useless when choosing a value less than 3.