How does Python ThreadPoolExecutor switch between concurrent threads?
In the case of the async/await event loop, switching between different pieces of code happens at the await calls. Does the ThreadPoolExecutor run each submitted task for a random amount of time? Or until something somewhere calls Thread.sleep()? Or the OS temporarily switches to do something else and the thread is forced to release the GIL, thus allowing some other thread to grab the GIL next time?
Does the ThreadPoolExecutor run each submitted task for a random amount of time?
No. The executor runs a thread pool with a work queue. When you add a new task, a thread will pick it up and run it to completion. Individual threads do not switch tasks before the previous task has completed.
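To illustrate the run-to-completion behaviour, here is a minimal sketch. The function name `task` and the pool size of 2 are arbitrary choices made for this example. It submits six tasks to two workers and records which worker thread each task started and finished on.

```python
import threading
from concurrent.futures import ThreadPoolExecutor

def task(n):
    # Hypothetical workload: record the worker thread before and after
    # doing some pure-Python work.
    started_on = threading.current_thread().name
    sum(range(100_000))
    finished_on = threading.current_thread().name
    return n, started_on, finished_on

# Six tasks, but only two worker threads to run them.
with ThreadPoolExecutor(max_workers=2) as pool:
    results = list(pool.map(task, range(6)))

for n, started_on, finished_on in results:
    print(f"task {n}: started on {started_on}, finished on {finished_on}")

workers_used = {started_on for _, started_on, _ in results}
```

Every task reports the same thread name at start and finish, and no more than two distinct worker names appear. The threads themselves are still preempted while they run, but a worker never puts a half-finished task back on the queue to pick up another one.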
As long as the global interpreter lock (GIL) is still around, only one thread can execute Python bytecode at any given moment (within one process; not counting multiple sub-interpreters). Since Python 3.2, a thread holding the GIL is asked to release it after 5 ms if other threads are waiting (configurable via sys.setswitchinterval()), as opposed to every 100 instructions like it was before. For comparison, the Linux kernel typically runs its scheduler with time slices between 0.75 and 6 ms.
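The switch interval can be inspected and changed at runtime through the `sys` module. A small sketch follows; the value 0.001 is only an example for this illustration, not a recommendation.

```python
import sys

# Read the current switch interval in seconds (0.005 by default on CPython).
original_interval = sys.getswitchinterval()
print(f"switch interval: {original_interval * 1000:.1f} ms")

# Ask the interpreter to request GIL hand-offs more often (example value).
sys.setswitchinterval(0.001)
changed_interval = sys.getswitchinterval()

# Restore the previous setting so the rest of the program is unaffected.
sys.setswitchinterval(original_interval)
restored_interval = sys.getswitchinterval()
```

A shorter interval makes threads hand over the GIL more often at the price of more switching overhead. It does not make Python bytecode run in parallel.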
Or until something somewhere calls Thread.sleep()?
Every system call that may block / sleep and every call into compiled code that may run for extended periods of time is typically written to release the GIL.
This works in addition to the 5 ms time slice and is the intended mode of operation for thread pools. If the background threads spend most of their time with the GIL released, waiting on IO or on expensive library calls (such as NumPy), they parallelize effectively. If they mostly execute Python bytecode, they just compete with the main event loop for the GIL. That can still work fine if the event loop is mostly idle, but it isn't exactly best practice and it doesn't scale well.
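The effect of blocking calls releasing the GIL can be seen with a rough timing sketch. Here `time.sleep()` stands in for blocking IO; the function name `blocking_call` and the 0.2 s duration are made up for the example.

```python
import time
from concurrent.futures import ThreadPoolExecutor

def blocking_call(seconds):
    # time.sleep() releases the GIL while waiting, much like blocking IO does.
    time.sleep(seconds)
    return seconds

start = time.perf_counter()
with ThreadPoolExecutor(max_workers=4) as pool:
    # Four waits of 0.2 s each, spread over four worker threads.
    waited = list(pool.map(blocking_call, [0.2] * 4))
elapsed = time.perf_counter() - start

print(f"total wait requested: {sum(waited):.1f} s, wall time: {elapsed:.2f} s")
```

The wall time should come out close to a single 0.2 s wait rather than the 0.8 s sum, because the waits overlap. Replacing the sleep with a pure-Python loop would be expected to remove most of that benefit, since the threads would then take turns holding the GIL.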
Or the OS temporarily switches to do something else and the thread is forced to release the GIL thus allowing some other thread to grab the GIL next time?
The OS doesn't know about the GIL. If the OS interrupts the thread currently holding the GIL, that thread continues to hold it, and no other thread can acquire it during that time. Once the thread wakes back up, it will likely have exceeded its 5 ms time slice and will release the GIL at the next opportunity.