Consider the following code:
import sys

a = [1, 2, 3]

def foo(x):
    print(sys.getrefcount(x))

foo(a)  # prints out 4 -- but why?
When we invoke foo(a) and print(sys.getrefcount(x)) executes, the list [1, 2, 3] is referenced by:
- a
- x of the function foo
- the argument of sys.getrefcount
I expected 3 to be printed out. What have I missed?
Disclaimer: I am far from an expert in CPython internals. This answer is the best of my understanding of what's going on from having skimmed the CPython source code. It may not be the whole story, or applicable to all cases.
The reference you're missing is the local variable x, which is different from the argument reference.
Internally, the call to foo is carried out by invoking PyObject_Call with a tuple of the function arguments. The strong reference held by that tuple is the one reference that gets added when calling sys.getrefcount(x). Calling foo is different, however, because the arguments aren't acted on directly in the C code; they need to be made available as local variables that are accessible during the subsequent bytecode execution. The structure that stores the Python-accessible local variables keeps its own strong reference to the argument objects. This is why the reference count is incremented a second time when calling a Python function.
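If you want to check this on your own interpreter, one rough way is to compare the count seen from the calling scope with the count seen from inside a Python function. This is only a sketch: refcount_inside_call is a hypothetical helper name I made up for illustration, and the exact numbers depend on the CPython version and build, so I'm deliberately not quoting an expected output.

```python
import sys

def refcount_inside_call(x):
    # Hypothetical helper (not part of the question's code): returns the
    # reference count as observed from inside a Python-level function call.
    return sys.getrefcount(x)

a = [1, 2, 3]

# Count as seen from the calling scope, where only the call to
# sys.getrefcount itself is involved.
baseline = sys.getrefcount(a)

# Count as seen from inside a Python function that received the same list.
inside = refcount_inside_call(a)

print("baseline:", baseline)
print("inside call:", inside)
print("extra references during the call:", inside - baseline)
```

The difference between the two numbers is the part attributable to calling a Python function, as opposed to calling sys.getrefcount directly. Treat it as an observation about your particular interpreter rather than a guarantee.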