python  python-3.x  optimization  cpython  python-3.11

Why is the sum function slower if the 'start' argument is an instance of a custom class?


I was playing around with the sum function and observed the following behaviour.

case 1:

source = """
class A:
    def __init__(self, a):
        self.a = a
    
    def __add__(self, other):
        return self.a + other

sum([*range(10000)], start=A(10))
"""

import timeit
print(timeit.timeit(stmt=source))

As you can see, I am using an instance of a custom class as the start argument to the sum function. Benchmarking the above code takes around 192.60747704200003 seconds on my system.

case 2:

source = """
class A:
    def __init__(self, a):
        self.a = a
    
    def __add__(self, other):
        return self.a + other

sum([*range(10000)], start=10)  # <- Here
"""

import timeit
print(timeit.timeit(stmt=source))

But if I remove the custom class instance and use an int object directly, it takes only 111.48285191600007 seconds. I am curious to understand the reason for this speed difference.

My system info:

>>> import platform
>>> platform.platform()
'macOS-12.5-arm64-arm-64bit'
>>> import sys
>>> sys.version
'3.11.0 (v3.11.0:deaf509e8f, Oct 24 2022, 14:43:23) [Clang 13.0.0 (clang-1300.0.29.30)]'

Solution

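  • builtin_sum_impl (CPython's C implementation of sum) has 2 implementations inside. The fast one is used if start is a number (an exact int or float): it skips creating Python "number objects" and just sums the numbers in C.

    The other, slower implementation is used when start is not a number. It goes through the __add__ method of the "number objects" on every step, because it assumes you are summing some weird classes.

    By passing an A instance as start, you forced it to use the slower one. The check on start happens only once, before the loop, so sum stays on the slow path even though A.__add__ returns a plain int after the first addition.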
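A minimal sketch for checking this explanation on your own machine. It reuses the class A from the question and adds a call counter, which is an assumption made here for illustration (as are the names data, t_int and t_custom). The counter shows that A.__add__ itself runs only once per sum call, so the extra time comes from the generic addition path rather than from the custom method. The list is built once outside the timed code, and the timings are printed without any expected values because they vary by machine and Python version.

```python
import timeit


class A:
    """Class from the question, plus a call counter added for illustration."""

    calls = 0

    def __init__(self, a):
        self.a = a

    def __add__(self, other):
        A.calls += 1
        return self.a + other  # returns a plain int, not an A


data = [*range(10000)]  # built once, so list creation is not timed

total_int = sum(data, start=10)        # start is an exact int
total_custom = sum(data, start=A(10))  # start is a custom object
calls_for_one_sum = A.calls            # how often A.__add__ ran in that one sum

# Fewer repetitions than timeit's default of 1,000,000 to keep the run short.
t_int = timeit.timeit(lambda: sum(data, start=10), number=200)
t_custom = timeit.timeit(lambda: sum(data, start=A(10)), number=200)

print("results equal:", total_int == total_custom)
print("A.__add__ calls in one sum:", calls_for_one_sum)
print(f"start=10:    {t_int:.4f} s")
print(f"start=A(10): {t_custom:.4f} s")
```

Both calls return the same total, so the choice of start changes only which code path sum takes, not the result.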