python cpython reference-counting

Why does getrefcount increase by 2 when put inside a function?


Consider the following code:

import sys

a = [1, 2, 3]

def foo(x):
    print(sys.getrefcount(x))

foo(a) # prints out 4 -- but why?

When we invoke foo(a) and print(sys.getrefcount(x)) executes, the list [1, 2, 3] is referenced by:

  1. variable a
  2. the parameter x of the function foo
  3. the parameter of sys.getrefcount

I expected 3 to be printed out. What have I missed?


Solution

  • Disclaimer: I am far from an expert in CPython internals. This answer reflects my best understanding of what's going on, based on skimming the CPython source code. It may not be the whole story, or apply to all cases.


    The reference you're missing is the local variable x, which is different from the argument reference.

    Internally, the call to foo is carried out by invoking PyObject_Call with a tuple of the function arguments. That tuple holds a strong reference to each argument, and this is the one reference that gets added when calling sys.getrefcount(x). Calling foo is different: the arguments aren't acted on directly in the C code but must instead be made available as local variables that are accessible during the subsequent bytecode execution. The structure that stores the Python-accessible local variables keeps its own strong reference to the argument objects, which is why the reference count gets incremented a second time when calling a Python function.
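
A minimal sketch for observing this yourself: it compares the count reported at the call site with the count reported from inside a Python function. The names count_inside, outside and inside are made up for illustration. The exact numbers are a CPython implementation detail and differ between versions, so the script only prints what it sees rather than promising a particular value.

```python
import sys


def count_inside(x):
    # Reference count as reported from within a Python-level function body.
    return sys.getrefcount(x)


a = [1, 2, 3]

# Count seen when sys.getrefcount is called directly on `a`.
outside = sys.getrefcount(a)

# Count seen after passing `a` through one Python-level call.
inside = count_inside(a)

print("direct call:        ", outside)
print("inside a function:  ", inside)
print("extra refs from call:", inside - outside)
```

On the interpreter used in the question, the difference would be the 2 that prompted the question. Newer CPython releases pass arguments differently and may report a smaller difference.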