
Storage of floating point numbers in memory in Python


I know that Python keeps an internal cache of small integers rather than creating them at runtime:

id(5)
4304101544

Repeating this code later in the same kernel gives the same id:

id(5)
4304101544

I thought that this wouldn't work for floating point numbers because it can't possibly maintain a pre-calculated list of all floating point numbers.

However, this code returns the same id twice:

id(4.33+1), id(5.33)
(5674699600, 5674699600)

Repeating the same code some time later returns a different location in memory:

id(4.33 + 1), id(5.33)
(4962564592, 4962564592)

What's going on here?


Solution

  • It's not just that the object is garbage collected and a new object is then stored in the same location as the previous one.

    Something different is at work here.

    We can use the dis module to look at the bytecode generated:

    import dis
    
    def f():
        one, two = 4.3333333, 3.3333333 + 1.
        a, b = id(one), id(two)
        return one, two, a, b
    
    dis.dis(f)
    one, two, a, b = f()
    print((one, two, a, b))
    

    This prints the disassembly, followed by the values returned by f():

      1           0 RESUME                   0
     
      2           2 LOAD_CONST               1 ((4.3333333, 4.3333333))
                  4 UNPACK_SEQUENCE          2
                  8 STORE_FAST               0 (one)
                 10 STORE_FAST               1 (two)
    
      3          12 LOAD_GLOBAL              1 (NULL + id)
                 24 LOAD_FAST                0 (one)
                 26 PRECALL                  1
                 30 CALL                     1
                 40 LOAD_GLOBAL              1 (NULL + id)
                 52 LOAD_FAST                1 (two)
                 54 PRECALL                  1
                 58 CALL                     1
                 68 STORE_FAST               3 (b)
                 70 STORE_FAST               2 (a)
    
      4          72 LOAD_FAST                0 (one)
                 74 LOAD_FAST                1 (two)
                 76 LOAD_FAST                2 (a)
                 78 LOAD_FAST                3 (b)
                 80 BUILD_TUPLE              4
                 82 RETURN_VALUE
    (4.3333333, 4.3333333, 12424698960, 12424698960)
    

    The ids of one and two are also stable over time:

    >>> id(one), id(two)
    (12424698960, 12424698960)
    

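    They are indeed the same object. The compiler evaluates 3.3333333 + 1. while compiling the function (constant folding), so no addition appears in the bytecode: a single LOAD_CONST loads the tuple (4.3333333, 4.3333333), and CPython stores equal constants within one code object only once. The function's code object keeps that float alive, which is why the ids stay stable. The same folding applies to id(4.33+1), id(5.33) in the question, but there the constants belong to the code compiled for that one statement, so a later run can get a different address.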
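
One way to check this without reading the full disassembly is to look for an arithmetic instruction in the bytecode. The following is a minimal sketch, assuming CPython; the names folded, at_runtime and has_binary_op are made up for illustration. The identity check on the last line depends on the implementation and version, so no expected output is given for it.

```python
import dis

def folded():
    # Both operands are literals, so the compiler can evaluate
    # 3.3333333 + 1. before any bytecode runs.
    one = 4.3333333
    two = 3.3333333 + 1.
    return one, two

def at_runtime(x):
    # x is only known when the function is called, so the addition
    # has to stay in the bytecode as a real instruction.
    return x + 1.

def has_binary_op(func):
    """Return True if func's bytecode contains a BINARY_* instruction."""
    return any(ins.opname.startswith("BINARY_")
               for ins in dis.get_instructions(func))

one, two = folded()
print(has_binary_op(folded))      # is an addition left after compilation?
print(has_binary_op(at_runtime))  # the addition on x cannot be folded
print(folded.__code__.co_consts)  # constants stored on the code object
print(one is two)                 # identity check, implementation-dependent
```

If the addition was folded, folded() contains no BINARY_* instruction, while at_runtime() still does, because its operand is a variable rather than a literal.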