I've been trying to understand the built-in view objects returned by .items(), .values(), .keys() in Python 3, or similarly by .viewitems(), .viewvalues(), .viewkeys() in Python 2.7. There are other threads on that subject, but none (not even the docs) seems to describe how they work internally.
The main gain here seems to be efficiency compared to the list copy returned in Python 2. They are often compared to a window onto the dictionary's items (like in this thread).
But what is that window, and why is it more efficient?
The only thing I can see is that the view objects seem to be set-like objects, which are generally faster for membership testing. But is this the only factor?
Code sample
>>> example_dict = {'test':'test'}
>>> example_dict.items()
dict_items([('test', 'test')])
>>> type(example_dict.items())
<class 'dict_items'>
So, my question is regarding this dict_items class. How does that work internally?
Dict views store a reference to their parent dict, and they translate operations on the view to corresponding operations on the dict.
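To make the "reference plus delegation" idea concrete, here is a minimal pure-Python sketch of a keys view. The class name KeysViewSketch is hypothetical and this is not CPython's actual code (the real views are written in C); it only illustrates the general shape.

```python
class KeysViewSketch:
    """Hypothetical stand-in for dict_keys, for illustration only."""

    def __init__(self, parent):
        # Keep a reference to the parent dict; no keys are copied.
        self._parent = parent

    def __len__(self):
        # Length is answered by the dict itself.
        return len(self._parent)

    def __iter__(self):
        # Iterating the view just iterates the dict's keys.
        return iter(self._parent)

    def __contains__(self, key):
        # Membership is a hash lookup on the parent dict.
        return key in self._parent


d = {'a': 1}
view = KeysViewSketch(d)
d['b'] = 2           # mutate the dict after creating the view
print(list(view))    # ['a', 'b']
print('b' in view)   # True
```

Because the sketch holds no data of its own, changes to the dict show up in the view immediately, which is the "window" behaviour the question asks about.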
Iteration over a dict view is more efficient than building a list and iterating over that, because building a list takes time and memory that you don't have to spend with the view. Under the old approach, Python would iterate over the dict's underlying storage to build a new list, and then you would iterate over the list. Iterating over a dict view uses an iterator that walks through the dict's underlying storage directly, skipping the unnecessary list step.
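A rough way to see the memory side of this is to compare the size of a view with the size of a list built from it. This is only a sketch: sys.getsizeof reports the container's own footprint (not the tuples the list points to), and the exact numbers vary by Python version and platform.

```python
import sys

d = {i: str(i) for i in range(100_000)}

view = d.items()        # small object that refers back to d
copy = list(d.items())  # new list with one slot per item

view_size = sys.getsizeof(view)
copy_size = sys.getsizeof(copy)

# The view's size does not grow with the dict; the list's does.
print(view_size, copy_size)
```

The list version also has to allocate a fresh tuple for every item up front, whereas iterating the view produces them one at a time.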
Keys and items views also support efficient containment tests and set-like intersection/difference/etc. operations, because they get to perform direct hash lookups on the underlying dict instead of iterating through a list and checking equality element by element. Values views are the exception: values are not hashed, so a containment test on one is still a linear scan.
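A short illustration of those containment tests and set-like operations on Python 3 views (the example data is made up):

```python
d = {'a': 1, 'b': 2, 'c': 3}
keys = d.keys()
items = d.items()

# Containment: the key is looked up by hash in the dict itself.
print('a' in keys)          # True
print(('a', 1) in items)    # True
print(('a', 99) in items)   # False: key exists, value differs

# Set-like operations return ordinary sets.
common = keys & {'b', 'c', 'z'}
rest = keys - {'a'}
print(sorted(common))       # ['b', 'c']
print(sorted(rest))         # ['b', 'c']

# Values views do not offer the set operators.
values_are_setlike = hasattr(d.values(), '__and__')
print(values_are_setlike)   # False
```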
If you want to see the concrete implementation used by CPython, you can take a look in the official repository, but this implementation is subject to change. It has changed, repeatedly.