Is Python's `list.clear()` thread-safe?


In Python, suppose one thread is appending/popping items to/from a list/collections.deque/similar built-in container, while another thread occasionally empties the container via its clear() method. Is this interaction thread-safe? Or is it possible for the clear() to interfere with a concurrent append()/pop() operation, leaving the list uncleared or corrupted?

My interpretation of the accepted answer here suggests that the GIL should prevent such interference, at least for lists. Am I correct?

As a related follow-up, if this is not thread-safe, I suppose I should use a queue.Queue instead. But what is the best (i.e., cleanest, safest, fastest) way to clear it from the second thread? See the comments on this answer for concerns about using the (undocumented) queue.Queue().queue.clear() method. Need I really use a loop to get() all the items one by one?


Solution

  • Update: Methods like list.clear() are not atomic, in the sense that other threads can add elements to the list (or other container) before the method returns to the calling code. They are "thread-safe" in the sense that the container will never end up in an inconsistent state that causes an exception, but they are not "atomic".

    In other words: the list object will never be "broken", with or without the use of a Lock, but its contents are not deterministic.

    The following snippet inserts data before list.clear() returns, both from the same thread and from another thread:

    import threading, time
    
    class A:
        def __init__(self, container, delay=0.2):
            self.container = container
            self.delay = delay
        def __del__(self):
            # Runs inside target.clear(), when the list drops the last reference to this object
            time.sleep(self.delay)
            self.container.append("thing")
    
    def doit():
        target = []
        def interferer():
            # Other thread: appends while clear() is still blocked in A.__del__
            time.sleep(0.1)
            target.append("this tries to be first")
        target.append(A(target, delay=0.3))
        t = threading.Thread(target=interferer)
        t.start()
        target.clear()  # returns only after A.__del__ has finished
        t.join()
        return target
    
    In [37]: doit()
    Out[37]: ['this tries to be first', 'thing']

    So, if one needs a sequence that is both "thread-safe" and "atomic", it has to be built on collections.abc.MutableSequence, with appropriate locks in the methods that perform mutations.
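
As a rough illustration of that idea, here is a minimal sketch of such a wrapper. The class name LockedList and its single re-entrant lock are assumptions made for this example, not an established recipe. Only the methods overridden here run as one locked step; other mixin methods inherited from MutableSequence (remove, reverse, +=, iteration) are composed of several locked calls and are not atomic as a whole. Finalizers that run during clear(), as in the snippet above, can still append from the same thread.

```python
import threading
from collections.abc import MutableSequence


class LockedList(MutableSequence):
    """Hypothetical wrapper: each overridden method runs under one RLock."""

    def __init__(self, iterable=()):
        self._lock = threading.RLock()
        self._items = list(iterable)

    # --- the five abstract methods MutableSequence requires ---
    def __len__(self):
        with self._lock:
            return len(self._items)

    def __getitem__(self, index):
        with self._lock:
            return self._items[index]

    def __setitem__(self, index, value):
        with self._lock:
            self._items[index] = value

    def __delitem__(self, index):
        with self._lock:
            del self._items[index]

    def insert(self, index, value):
        with self._lock:
            self._items.insert(index, value)

    # --- mixin methods overridden so each is a single locked step ---
    def append(self, value):
        with self._lock:
            self._items.append(value)

    def pop(self, index=-1):
        with self._lock:
            return self._items.pop(index)

    def clear(self):
        with self._lock:
            self._items.clear()

    def extend(self, values):
        values = list(values)  # consume generators before taking the lock
        with self._lock:
            self._items.extend(values)


bag = LockedList()


def producer(tag):
    for i in range(1000):
        bag.append((tag, i))


threads = [threading.Thread(target=producer, args=(n,)) for n in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

total_after_appends = len(bag)
bag.clear()
print(total_after_appends, len(bag))
```

Materialising the iterable before taking the lock in extend() is a design choice: it keeps a slow generator from holding the lock, at the cost of a temporary copy.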

    Original answer

    As put in the comments: all operations on built-in data structures are thread-safe in Python. What has ensured this so far is the GIL (global interpreter lock), which otherwise penalizes multi-threaded code in Python.
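
A small experiment along these lines shows what "thread-safe" means here: several threads append to one plain list without any lock, and no append is lost. The names shared, worker, N_THREADS and N_APPENDS are made up for this sketch.

```python
import threading

shared = []          # plain built-in list, no explicit lock
N_THREADS = 8
N_APPENDS = 10_000


def worker(tag):
    for i in range(N_APPENDS):
        shared.append((tag, i))  # each append is a single list operation


threads = [threading.Thread(target=worker, args=(n,)) for n in range(N_THREADS)]
for t in threads:
    t.start()
for t in threads:
    t.join()

# Every append landed: the list is intact and nothing was dropped.
print(len(shared) == N_THREADS * N_APPENDS)  # True
```

The order in which items from different threads interleave is not fixed; only the integrity of the list is.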

    From Python 3.13 onwards, there is the option of running Python code without the GIL, but it is a language guarantee that such operations on built-in data structures will remain thread-safe, through the use of finer-grained locking. See the "Container Thread Safety" section of PEP 703, which not only explains the mechanism going forward but also reasserts the current status quo: these modifications are effectively thread-safe, though not "atomic".

    However, depending on your code, you may wish to express the list modification with another operation instead of a method call, since some methods can't be atomic. The section of PEP 703 linked above gives the example of list.extend, which simply can't be atomic if used with a generator object. So, to lessen the chances of someone changing your code in the future, clearing the list can be expressed as mylist[:] = (). I have the feeling one would think twice before replacing this with a method call that could lead to undesired race conditions.
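
To make both points concrete, here is a small sketch. It forces another thread's append into the middle of a list.extend() call fed by a generator, then clears the list with slice assignment. The two Event objects and the generator slow_items are scaffolding invented for this example so that the interleaving is reproducible. On a conventional (GIL) CPython build the first print shows [1, 'intruder', 2].

```python
import threading

target = []
first_item_added = threading.Event()
intruder_added = threading.Event()


def slow_items():
    yield 1
    # extend() has already stored 1 by the time the generator is resumed here
    first_item_added.set()
    intruder_added.wait(timeout=5)   # give the other thread time to append
    yield 2


extender = threading.Thread(target=lambda: target.extend(slow_items()))
extender.start()

first_item_added.wait(timeout=5)
target.append("intruder")            # lands in the middle of the extend() call
intruder_added.set()
extender.join()

interleaved = list(target)
print(interleaved)

# Slice assignment empties the same list object in place,
# so every other reference to it sees the empty list too.
alias = target
target[:] = ()
print(alias, alias is target)
```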