
Do Python nested functions copy-on-could-write?


I tested the following code:

def nested(x):
    def inner(x):
        return x*x
    print(id(inner), id(inner.__code__), id(inner.__closure__))
    return inner

nested(3)
x = [list(range(i)) for i in range(5000)] # create some memory pressure
a = nested(3)
x = [list(range(i)) for i in range(5000)] # create some memory pressure
nested(3)

# 139906265032768 139906264446704 8845216
# 139906265032768 139906264446704 8845216
# 139906264258624 139906264446704 8845216

It seems that if Python detects that there is an outer reference to the cached nested function, then it creates a new function object.

Now - assuming my reasoning so far is not completely off - my question: What is this good for?

My first idea was "OK, if the user has a reference to the cached function, they may have messed with it, so better make a clean new one." But on second thoughts that doesn't seem to wash, because the copy is not a deep copy. And what if the user messes with the function and then throws the reference away?

Supplementary question: Does Python do any other fiendishly clever things behind the scenes? And is this at all related to the slower execution of nested compared to flat?


Solution

  • As for the time difference, a look at the bytecode of the functions provides some hints. Comparing nested() with fake_nested() shows that fake_nested() only loads the already-defined global function inner(), whereas nested() has to create that function on every call. That creation adds some overhead, while the other operations are relatively fast.

    >>> import dis
    >>> dis.dis(flat)
      2           0 LOAD_GLOBAL              0 (inner)
                  3 LOAD_FAST                0 (x)
                  6 CALL_FUNCTION            1
                  9 RETURN_VALUE        
    >>> dis.dis(nested)
      2           0 LOAD_CONST               1 (<code object inner at 0x7f2958a33830, file "<stdin>", line 2>)
                  3 MAKE_FUNCTION            0
                  6 STORE_FAST               1 (inner)
    
      4           9 LOAD_FAST                1 (inner)
                 12 LOAD_FAST                0 (x)
                 15 CALL_FUNCTION            1
                 18 RETURN_VALUE        
    >>> dis.dis(fake_nested)
      2           0 LOAD_FAST                0 (x)
                  3 STORE_FAST               1 (y)
    
      3           6 LOAD_FAST                0 (x)
                  9 STORE_FAST               2 (z)
    
      4          12 LOAD_GLOBAL              0 (inner)
                 15 LOAD_FAST                0 (x)
                 18 CALL_FUNCTION            1
                 21 RETURN_VALUE        
    
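The overhead described above can be measured directly. The following is a minimal sketch, not a rigorous benchmark: the definitions of flat(), nested() and fake_nested() are reconstructed from the disassembly shown here and are an assumption about the original code, and time_call() is a hypothetical helper introduced only for illustration. Absolute numbers depend on the machine and the Python version.

```python
import dis
import timeit

def inner(x):
    return x * x

def flat(x):
    # Looks up the module-level inner() and calls it.
    return inner(x)

def nested(x):
    # Builds a fresh function object from a constant code object on every call.
    def inner(x):
        return x * x
    return inner(x)

def fake_nested(x):
    # Two extra local stores, so it does some additional work without
    # creating a function.
    y = x
    z = x
    return inner(x)

def time_call(func, number=100_000):
    # Hypothetical helper: best of three runs of `number` calls each.
    return min(timeit.repeat(lambda: func(3), number=number, repeat=3))

timings = {f.__name__: time_call(f) for f in (flat, nested, fake_nested)}
for name, seconds in timings.items():
    print(f"{name:12s} {seconds:.4f} s")

# Opcode names vary between Python versions, so this line only reports
# which of the three functions contain a function-creation instruction.
for f in (flat, nested, fake_nested):
    ops = {ins.opname for ins in dis.get_instructions(f)}
    print(f.__name__, "MAKE_FUNCTION" in ops)
```

If nested() comes out slower than fake_nested(), that points at function creation rather than the extra local assignments as the source of the difference.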

    As for the caching of the inner function, the other answer already clarifies that a new inner() function is created every time nested() runs. To see this more clearly, consider the following variation on nested(), cond_nested(), which creates the same function under two different names depending on a flag. On the first run, with the flag False, the second function, inner2(), is created. When I then set the flag to True, the first function, inner1(), is created and the memory occupied by inner2() is freed. So when I run it again with the flag True, inner1() is created once more and is assigned the memory that inner2() used to occupy, which is now free.

    >>> def cond_nested(x, flag=False):
    ...     if flag:
    ...         def inner1(x):
    ...             return x*x
    ...         cond_nested.func = inner1   # keep the function alive after the call
    ...         print(id(inner1))
    ...         return inner1(x)
    ...     else:
    ...         def inner2(x):
    ...             return x*x
    ...         cond_nested.func = inner2   # keep the function alive after the call
    ...         print(id(inner2))
    ...         return inner2(x)
    ...
    >>> cond_nested(2)
    139815557561112
    4
    >>> cond_nested.func
    <function inner2 at 0x7f2958a47b18>
    >>> cond_nested(2, flag=True)
    139815557561352
    4
    >>> cond_nested.func
    <function inner1 at 0x7f2958a47c08>
    >>> cond_nested(3, flag=True)
    139815557561112
    9
    >>> cond_nested.func
    <function inner1 at 0x7f2958a47b18>
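The same point can be checked without relying on memory addresses. This is a small sketch under the assumption that nested() returns its inner function, as in the question; the names kept, func_ids and code_ids are made up for the example. Because all returned functions are kept alive at the same time, their ids cannot be reused, so any matching ids seen in the question come from freed memory being recycled rather than from a cached function.

```python
def nested(x):
    def inner(x):
        return x * x
    return inner

# Hold a reference to every returned function so none can be freed.
kept = [nested(3) for _ in range(3)]

# Objects that are alive at the same time always have distinct ids.
func_ids = {id(f) for f in kept}
print(len(func_ids))

# The code object is a constant of nested() and is shared by all of them.
code_ids = {id(f.__code__) for f in kept}
print(len(code_ids))

print(kept[0] is kept[1])                    # different function objects
print(kept[0].__code__ is kept[1].__code__)  # same code object
```

So only the lightweight function wrapper is rebuilt on each call; the compiled code is created once, when nested() itself is compiled.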