I have a large dictionary in which I need to look up values many times. My keys are integers, but they represent labels, so they never need to be added, subtracted, and so on. I ended up comparing access times between a string-keyed and an integer-keyed dictionary, and here is the result.
from timeit import Timer
Dint = dict()
Dstr = dict()
for i in range(10000):
    Dint[i] = i        # integer keys
    Dstr[str(i)] = i   # the same labels as string keys
print 'string key in Dint',
print(Timer("'7498' in Dint", "from __main__ import Dint").timeit(100000000))
print 'int key in Dint',
print(Timer("7498 in Dint", "from __main__ import Dint").timeit(100000000))
print 'string key in Dstr',
print(Timer("'7498' in Dstr", "from __main__ import Dstr").timeit(100000000))
print 'int key in Dstr',
print(Timer("7498 in Dstr", "from __main__ import Dstr").timeit(100000000))
which produces the following, with slight variations between runs but the same pattern each time:
string key in Dint 4.5552944017
int key in Dint 7.14334390267
string key in Dstr 6.69923791116
int key in Dstr 5.03503126455
Does this prove that a dictionary with string keys is faster to access than one with integer keys?
CPython's dict implementation is in fact optimized for string key lookups. There are two different lookup functions, lookdict and lookdict_string (lookdict_unicode in Python 3), that can be used. Python uses the string-optimized version until a lookup is made with a non-string key, after which the more general function is used. You can look at the actual implementation by downloading CPython's source and reading through dictobject.c.
As a result of this optimization, lookups are faster when a dict has all string keys.
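If you want to check this on your own interpreter, here is a minimal sketch of a like-for-like comparison: an int key looked up in an all-int dict versus a str key looked up in an all-str dict. It assumes Python 3.5 or later (for the globals argument of timeit), and time_lookups and its parameters are names made up for this example. The numbers depend heavily on the CPython version and the machine; newer releases have reworked the dict internals, so the gap may be smaller or absent.

```python
from timeit import timeit


def time_lookups(n_keys=10000, probe=7498, number=200000):
    """Time membership tests on an int-keyed and a str-keyed dict.

    Hypothetical helper for illustration; returns a dict mapping a
    label to the total time in seconds for `number` lookups.
    """
    d_int = {i: i for i in range(n_keys)}       # all keys are ints
    d_str = {str(i): i for i in range(n_keys)}  # all keys are strs

    # Pass the dict and the key through `globals` so the timed
    # statement is just the membership test itself.
    return {
        "int key in int dict": timeit(
            "k in d", globals={"k": probe, "d": d_int}, number=number
        ),
        "str key in str dict": timeit(
            "k in d", globals={"k": str(probe), "d": d_str}, number=number
        ),
    }


if __name__ == "__main__":
    for label, seconds in time_lookups().items():
        print("%s: %.4f s" % (label, seconds))
```

Run it several times and compare the pattern rather than a single figure, since individual timings are noisy.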