Tags: ios, grand-central-dispatch, dispatchsemaphore

Can a Dispatch Semaphore inadvertently deadlock itself?


Say we have a shared resource that several different global queues have access to and, for the sake of this question, we use a Dispatch Semaphore to manage that access. When a task on one of these global queues tells the semaphore to wait, the semaphore count is decremented and that thread has access to the shared resource. Is it possible that, while that task is still holding the semaphore, another (different) global queue tries to access the shared resource, and the thread that GCD grabs from its pool is the same thread that was grabbed for the previous queue (the queue that is currently holding the semaphore)? That would deadlock this thread and prevent the semaphore count from ever being re-incremented.


Solution

  • Short answer:

    Yes, using semaphores can result in deadlocks, but not for the reason you suggest.

    Long answer:

    If you have some dispatched task waiting for a semaphore, that worker thread is blocked until the signal is received, at which point it resumes execution and subsequently returns. As such, you don’t have to worry about another dispatched task trying to use the same thread, because that thread is temporarily removed from the thread pool. You never have to worry about two dispatched tasks trying to use the same thread at the same time. That is not the deadlock risk.
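    To make that concrete, here is a minimal sketch of the scenario in the question: tasks on two different global queues share one resource guarded by a semaphore with an initial value of 1. The names `SharedCounter` and `incrementFromSeveralQueues(taskCount:)` are hypothetical, introduced only for illustration, and the small task count is an assumption that keeps the number of briefly blocked threads well below the pool limit.

```swift
import Dispatch

// Hypothetical shared resource. It is marked @unchecked Sendable because the
// semaphore below, not the compiler, is what guarantees exclusive access.
final class SharedCounter: @unchecked Sendable {
    var value = 0
}

// Dispatches `taskCount` tasks across two global queues, each of which
// waits on the semaphore, touches the shared resource, and signals.
func incrementFromSeveralQueues(taskCount: Int) -> Int {
    let semaphore = DispatchSemaphore(value: 1)   // one task at a time
    let group = DispatchGroup()
    let counter = SharedCounter()
    let queues = [
        DispatchQueue.global(qos: .userInitiated),
        DispatchQueue.global(qos: .utility)
    ]

    for index in 0 ..< taskCount {
        queues[index % queues.count].async(group: group) {
            semaphore.wait()        // blocks this worker thread if the resource is taken
            counter.value += 1      // exclusive access to the shared resource
            semaphore.signal()      // lets one waiting task proceed
        }
    }

    group.wait()                    // block the caller until every task has finished
    return counter.value
}
```

    Calling `incrementFromSeveralQueues(taskCount: 20)` finishes and returns 20. The task holding the semaphore always has a thread of its own, so it can always reach its `signal` call, no matter which queue the waiting tasks were dispatched to.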

    That having been said, we have to be sensitive to the fact that the number of worker threads in the thread pool is extremely limited (currently 64 per QoS). If you exhaust the available worker threads, then anything else dispatched to GCD (with the same QoS) cannot run until some of those previously blocked worker threads are made available again.

    Consider:

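    import Dispatch
    
    print("start")
    
    let semaphore = DispatchSemaphore(value: 0)
    let queue = DispatchQueue.global()
    let group = DispatchGroup()
    let count = 10
    
    // Each of these tasks blocks a worker thread until a signal arrives.
    for _ in 0 ..< count {
        queue.async(group: group) {
            semaphore.wait()
        }
    }
    
    // Each of these tasks needs a free worker thread to release one waiter.
    for _ in 0 ..< count {
        queue.async(group: group) {
            semaphore.signal()
        }
    }
    
    // Runs only after all of the dispatched tasks have finished.
    group.notify(queue: .main) {
        print("done")
    }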
    

    That works fine. You have ten worker threads tied up with those wait calls and then the additional ten dispatched blocks call signal, and you’re fine.

    But if you increase count to 100, so that the wait calls alone exhaust the worker threads (a condition referred to as “thread explosion”), the above code will never finish, because the signal calls are waiting for worker threads that are tied up with all of those wait calls. None of those dispatched tasks with signal calls will ever get a chance to run. And when you exhaust the worker threads, that is generally a catastrophic problem, because anything trying to use GCD (for that same QoS) will not be able to run.


    By the way, the use of semaphores in the thread explosion scenario is just one particular way to cause a deadlock. But for the sake of completeness, it’s worth noting that there are lots of ways to deadlock with semaphores. The most common example is where a semaphore (or dispatch group or whatever) is used to wait for some asynchronous process, e.g.

    let semaphore = DispatchSemaphore(value: 0)
    someAsynchronousMethod {
        // do something useful
    
        semaphore.signal()
    }
    semaphore.wait()
    

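    That can deadlock if (a) you run that from the main queue; and (b) the asynchronous method happens to call its completion handler on the main queue, too. The wait call blocks the main thread, so the completion handler that would call signal never gets a chance to run. This is the prototypical semaphore deadlock.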
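    The following sketch reproduces that pattern without hanging forever, by giving the wait a timeout. Both `someAsynchronousMethod(completion:)` and `waitForCompletion(timeout:)` are hypothetical stand-ins written for this illustration, and the sketch assumes it is called from the main thread with nothing else draining the main queue in the meantime.

```swift
import Dispatch

// Hypothetical asynchronous API: it does its work on a background queue,
// then calls its completion handler on the main queue.
func someAsynchronousMethod(completion: @escaping @Sendable () -> Void) {
    DispatchQueue.global().async {
        // ... do something useful ...
        DispatchQueue.main.async {
            completion()
        }
    }
}

// Waits on the calling thread for the completion handler, giving up after
// `timeout` instead of blocking indefinitely.
func waitForCompletion(timeout: DispatchTimeInterval) -> DispatchTimeoutResult {
    let semaphore = DispatchSemaphore(value: 0)
    someAsynchronousMethod {
        semaphore.signal()
    }
    // If the caller is the main thread, the handler above cannot run while
    // this wait is blocking it, so the signal never arrives in time.
    return semaphore.wait(timeout: .now() + timeout)
}
```

    Under those assumptions, `waitForCompletion(timeout: .milliseconds(200))` returns `.timedOut`. With a plain `wait()` in place of the timed one, the same call would simply never return.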

    I only used the thread-explosion example above because the deadlock is not entirely obvious. But clearly there are lots of ways to cause deadlocks with semaphores.