pythonalgorithmsearchoperators

What algorithm is used when using the in operator in python to search a list?


When using the 'in' operator to search for an item in a list e.g.

if item in list:
  print item

What algorithm is used to search for this item. Is it a straight search of the list from beginning to end or does it use something like binary search?


Solution

  • lists can't be assumed to be in sorted order (or any order at all), so binary search won't work. Nor can the keys be assumed to be hashable, so unlike a dict or set a hash-table lookup can't be used to accelerate the search

    At a guess it's a straight-through check of every element from first to last.

    I'll try and dig up the relevant Python source code.

    --

    Edit: The Python list.__contains__() function, which implements the in operator, is defined in listobject.c:

       393 static int
       394 list_contains(PyListObject *a, PyObject *el)
       395 {
       396     Py_ssize_t i;
       397     int cmp;
       398 
       399     for (i = 0, cmp = 0 ; cmp == 0 && i < Py_SIZE(a); ++i)
       400         cmp = PyObject_RichCompareBool(el, PyList_GET_ITEM(a, i),
       401                                            Py_EQ);
       402     return cmp;
       403 }
    

    It iterates over every element in the list, from the first element to the last element (or until it finds a match.) No shortcuts here.

    --

    Edit 2: The plot thickens. If Python detects that you're testing for membership of an element in a constant list or set, like:

    if letter in ['a','e','i','o','u']:    # list version
    if letter in {'a','e','i','o','u'}:    # set version
    

    Edit 3 [@JohnMachin]:

    The constant list is optimised to a constant tuple in 2.5-2.7 and 3.1-3.3.
    The constant set is optimised to a (constant) frozenset in 3.3.

    See also @CoryCarson's answer.