pythonslicepython-itertools

Performance of list.extend slice vs islice


It seems that even when islice would theoretically be better, in practice, it is slower than just using slice. So I am a bit puzzled by the difference in performance between the usage of slice and islice here:

from time import perf_counter
from itertools import islice
from random import choices
from string import ascii_letters
import sys

test_str = "".join(choices(ascii_letters, k=10 ** 7))

arr1 = []
t0 = perf_counter()
for _ in range(10):
    arr1.extend(test_str[slice(1, sys.maxsize)])
t1 = perf_counter()
print('%5.1f ms ' % ((t1 - t0) * 1e3))

arr2 = []
t0 = perf_counter()
for _ in range(10):
    arr2.extend(islice(test_str, 1, sys.maxsize))
t1 = perf_counter()
print('%5.1f ms ' % ((t1 - t0) * 1e3))
552.6 ms
786.0 ms

To justify why I believe the difference to be unexpected, here is my mental model for how the execution goes for both cases:

Strict Slicing:

Lazy Slicing:

I don't see how allocating the whole string object at once just to be garbage collected later leads to still better performance, especially considering the fact that both slicing methods are implemented in C.


Solution

  • The additional iteration layer of islice is far more costly than the string slices. Allocation optimization by length hint is insignificant, at least on the three systems where I tried this.

    With the string slices, the extend method iterates directly over the string (slice).

    With islice, it instead iterates over the islice iterator, which in turn iterates over the string. Every single character gets requested and passed through this additional iterator.

    Benchmark results:

                           System 1     System 2     System 3
                          Py 3.11.10   Py 3.13.3    Py 3.13.0
    
    extend_slice          552 ± 2 ms   413 ± 2 ms   823 ±  6 ms
    extend_islice         849 ± 5 ms   648 ± 1 ms  1225 ± 28 ms
    just_slice              4 ± 0 ms     7 ± 0 ms    20 ±  1 ms
    iterate_slice         179 ± 1 ms   231 ± 1 ms   304 ±  8 ms
    iterate_islice        523 ± 1 ms   366 ± 2 ms   680 ±  3 ms
    extend_slice_len      553 ± 3 ms   414 ± 1 ms   840 ± 12 ms
    extend_slice_nolen    558 ± 2 ms   430 ± 1 ms   842 ± 11 ms
    extend_islice_nolen   852 ± 1 ms   649 ± 1 ms  1227 ± 22 ms
    extend_islice_len     848 ± 2 ms   636 ± 1 ms  1195 ± 39 ms
    

    extend_slice and extend_islice are your tests, we see time differences similar to what you saw.

    just_slice just creates the string slices but does nothing with them. Wee see they're very cheap in comparison.

    iterate_slice and iterate_islice don't extend a list, instead they just iterate the slices/islices (in a fast way). We see that the slices get iterated much faster than the islices.

    The last four rows show times for your tests again, but with the slices/islices wrapped in little objects that provide or don't provide length hints. We only see little/unclear effects of having / not having length hints.

    Benchmark code (system 3 has a low time limit, so instead of best 5 of 100 rounds I used best 3 of 7 there):

    from time import perf_counter
    from itertools import islice
    from random import choices
    from string import ascii_letters
    from collections import deque
    from statistics import mean, stdev
    import random
    import sys
    
    
    def extend_slice():
        arr = []
        for _ in range(10):
            arr.extend(test_str[slice(1, sys.maxsize)])
    
    
    def extend_islice():
        arr = []
        for _ in range(10):
            arr.extend(islice(test_str, 1, sys.maxsize))
    
    
    def just_slice():
        for _ in range(10):
            test_str[slice(1, sys.maxsize)]
    
    
    iterate = deque(maxlen=0).extend
    
    
    def iterate_slice():
        for _ in range(10):
            iterate(test_str[slice(1, sys.maxsize)])
    
    
    def iterate_islice():
        for _ in range(10):
            iterate(islice(test_str, 1, sys.maxsize))
    
    
    class WithLen:
        def __init__(self, iterable):
            self.iterable = iterable
        def __iter__(self):
            return iter(self.iterable)
        def __len__(self):
            return len(test_str) - 1
    
    class NoLen:
        def __init__(self, iterable):
            self.iterable = iterable
        def __iter__(self):
            return iter(self.iterable)
    
    
    def extend_slice_len():
        arr = []
        for _ in range(10):
            arr.extend(WithLen(test_str[slice(1, sys.maxsize)]))
    
    
    def extend_slice_nolen():
        arr = []
        for _ in range(10):
            arr.extend(NoLen(test_str[slice(1, sys.maxsize)]))
    
    
    def extend_islice_len():
        arr = []
        for _ in range(10):
            arr.extend(WithLen(islice(test_str, 1, sys.maxsize)))
    
    
    def extend_islice_nolen():
        arr = []
        for _ in range(10):
            arr.extend(NoLen(islice(test_str, 1, sys.maxsize)))
    
    
    funcs = [
        extend_slice,
        extend_islice,
        just_slice,
        iterate_slice,
        iterate_islice,
        extend_slice_len,
        extend_slice_nolen,
        extend_islice_nolen,
        extend_islice_len,
    ]
    
    
    test_str = "".join(choices(ascii_letters, k=10 ** 7))
    
    def print(*a, p=print, f=open('out.txt', 'w')):
        p(*a)
        p(*a, file=f, flush=True)
    
    times = {f: [] for f in funcs}
    def stats(f):
        ts = [t * 1e3 for t in sorted(times[f])[:5]]
        return f'{mean(ts):4.0f} ± {stdev(ts):2.0f} ms '
        return f'{mean(ts):5.1f} ± {stdev(ts):3.1f} ms '
    for _ in range(100):
        print(_)
        for f in random.sample(funcs, len(funcs)):
            t0 = perf_counter()
            f()
            t1 = perf_counter()
            times[f].append(t1 - t0)
    for f in funcs:
        print(stats(f), f.__name__)
    
    print('\nPython:', sys.version)