
Concatenate tuples using sum()


From this post I learned that you can concatenate tuples with sum():

>>> tuples = (('hello',), ('these', 'are'), ('my', 'tuples!'))
>>> sum(tuples, ())
('hello', 'these', 'are', 'my', 'tuples!')

This looks pretty nice, but why does it work? And is it optimal, or is there something in itertools that would be preferable to this construct?


Solution

  • The addition operator concatenates tuples in Python:

    ('a', 'b')+('c', 'd')
    Out[34]: ('a', 'b', 'c', 'd')
    

    From the docstring of sum:

    Return the sum of a 'start' value (default: 0) plus an iterable of numbers

    This means sum doesn't start with the first element of your iterable, but with an initial value passed as the start argument (the second positional argument).

    sum is intended for numeric values by default, so the default start value is 0. Summing an iterable of tuples therefore requires starting with an empty tuple, because 0 + ('hello',) would raise a TypeError. () is an empty tuple:

    type(())
    Out[36]: tuple
    

    That is why the concatenation works: sum(tuples, ()) computes () + ('hello',) + ('these', 'are') + ('my', 'tuples!').
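To make the role of the start value concrete, here is a minimal sketch of the fold that sum(tuples, ()) effectively performs. The helper name sum_like is hypothetical and only illustrates the idea; the built-in sum is implemented in C and may differ in detail.

```python
from functools import reduce
import operator


def sum_like(iterable, start=0):
    """Hypothetical stand-in for the built-in sum: fold + over the items."""
    total = start
    for item in iterable:
        # For tuples, + builds a brand-new tuple holding both operands.
        total = total + item
    return total


tuples = (('hello',), ('these', 'are'), ('my', 'tuples!'))

# Starting from () makes every step a tuple + tuple concatenation.
flat = sum_like(tuples, ())

# The same fold written with functools.reduce.
flat_reduce = reduce(operator.add, tuples, ())
```

With the default start of 0, the first step would be 0 + ('hello',), which raises a TypeError. That is why the empty tuple has to be passed explicitly.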

    As for performance, here is a comparison (itertools is imported as it):

    %timeit sum(tuples, ())
    The slowest run took 9.40 times longer than the fastest. This could mean that an intermediate result is being cached.
    1000000 loops, best of 3: 285 ns per loop
    
    
    %timeit tuple(it.chain.from_iterable(tuples))
    The slowest run took 5.00 times longer than the fastest. This could mean that an intermediate result is being cached.
    1000000 loops, best of 3: 625 ns per loop
    

    Now with t2 of size 10000:

    %timeit sum(t2, ())
    10 loops, best of 3: 188 ms per loop
    
    %timeit tuple(it.chain.from_iterable(t2))
    1000 loops, best of 3: 526 µs per loop
    

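    So if your collection of tuples is small, don't bother. If it is medium-sized or larger, use itertools.chain.from_iterable: sum builds a new intermediate tuple at every step, so its cost grows roughly quadratically with the total number of elements, while chain.from_iterable makes a single pass.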
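A rough way to reproduce this comparison outside IPython is sketched below. The input t2 is an assumption (10000 one-element tuples), since the original construction of t2 is not shown, and the function names flatten_sum and flatten_chain are made up for illustration. Absolute timings will vary by machine and Python version.

```python
import itertools
import timeit

# Assumed test input: 10000 one-element tuples (the original t2 is not shown).
t2 = tuple((i,) for i in range(10000))


def flatten_sum(tuples):
    # Each + copies everything accumulated so far into a new tuple.
    return sum(tuples, ())


def flatten_chain(tuples):
    # chain.from_iterable yields the items lazily; tuple() collects them
    # in a single pass.
    return tuple(itertools.chain.from_iterable(tuples))


if __name__ == "__main__":
    runs = 5
    for fn in (flatten_sum, flatten_chain):
        seconds = timeit.timeit(lambda: fn(t2), number=runs)
        print(f"{fn.__name__}: {seconds / runs * 1e3:.3f} ms per call")
```

Both functions return the same tuple; only the amount of copying differs.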