Tags: python, python-2.7, dictionary-comprehension, setdefault

Why doesn't setdefault work inside a dictionary comprehension?


Why does setdefault not increment the count by 1 for every occurrence of a value in a when it is used inside a dictionary comprehension, even though it does in a loop? What's going on here?

Alternative solutions are great. I'm mostly interested in understanding why this doesn't work.

A loop with setdefault works

a = [1,1,2,2,2,3,3]

b = {}

for x in a:
    b[x] = b.setdefault(x, 0) + 1

b

Out[4]: {1: 2, 2: 3, 3: 2}

A dictionary comprehension with setdefault doesn't work

b = {k: b.setdefault(k, 0) + 1 for k in a}

b

Out[7]: {1: 1, 2: 1, 3: 1}

Update

Thanks for the answers. I wanted to try timing the solutions.

import timeit
from collections import Counter

a = [1,1,2,2,2,3,3]


def using_get(a):
    b = {}
    for x in a:
        b[x] = b.get(x, 0) + 1
    return b


def using_setdefault(a):
    b = {}
    for x in a:
        b[x] = b.setdefault(x, 0) + 1
    return b


timeit.timeit(lambda: Counter(a), number=1000000)
Out[3]: 15.19974103783569

timeit.timeit(lambda: using_get(a), number=1000000)
Out[4]: 3.1597984457950474

timeit.timeit(lambda: using_setdefault(a), number=1000000)
Out[5]: 3.231248461129759

Solution

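  • There is no dictionary yet inside the dict comprehension. You are building a completely new dictionary, and b is rebound to it only after the comprehension has finished, replacing whatever b was bound to before.

    In other words, in your dictionary comprehension, b.setdefault() is called on a totally different dictionary: the one b was bound to beforehand. It has nothing to do with the object being built by the comprehension. If that earlier dictionary is empty, every setdefault() call returns 0, which is why each value in the new dictionary comes out as 1.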

    In fact, your dictionary comprehension only works if b was bound to an object with a .setdefault() method before you run the expression. If b is not yet defined, or not bound to an object with such a method, it simply fails with an exception:

    >>> a = [1,1,2,2,2,3,3]
    >>> b = {k: b.setdefault(k, 0) + 1 for k in a}
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
      File "<stdin>", line 1, in <dictcomp>
    NameError: global name 'b' is not defined
    >>> b = 42
    >>> b = {k: b.setdefault(k, 0) + 1 for k in a}
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
      File "<stdin>", line 1, in <dictcomp>
    AttributeError: 'int' object has no attribute 'setdefault'
    

    You cannot do what you want with a dictionary comprehension unless you group your numbers first, which requires sorting them and using itertools.groupby(). This is less efficient than a single pass, taking O(N log N) steps rather than O(N):

    >>> from itertools import groupby
    >>> {k: sum(1 for _ in group) for k, group in groupby(sorted(a))}
    {1: 2, 2: 3, 3: 2}
    

    Note that the standard library already comes with a tool to do counting; see the collections.Counter() object:

    >>> from collections import Counter
    >>> Counter(a)
    Counter({2: 3, 1: 2, 3: 2})
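
To make the point about b being a different dictionary concrete, here is a minimal sketch. It assumes b starts out as an empty dict, and the extra name old_b is introduced purely for illustration (it is not part of the original question) so the earlier dictionary can be inspected after b has been rebound:

```python
a = [1, 1, 2, 2, 2, 3, 3]

b = {}
old_b = b  # hypothetical extra name: a second reference to the original dict

# The right-hand side is evaluated in full before the name b is rebound,
# so every b.setdefault() call below operates on the original dictionary.
b = {k: b.setdefault(k, 0) + 1 for k in a}

print(b)           # {1: 1, 2: 1, 3: 1}  the new dict: each value is 0 + 1
print(old_b)       # {1: 0, 2: 0, 3: 0}  the old dict received the defaults
print(b is old_b)  # False: two separate objects
```

The old dictionary only ever receives the default value 0 for each key, because nothing writes the incremented value back to it. The loop version works precisely because it assigns b[x] on the same dictionary that setdefault() reads from.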