
Multiple Isolates vs one Isolate


How isolates are distributed across CPU cores

In Dart, you can run multiple isolates at the same time, and I haven't been able to find a guideline or best practice for using isolates.

My question is: how are overall CPU usage and performance affected by the number of isolates running at the same time, and is it better to use a small number of isolates (or even just one)?


Solution

  • One isolate per one thread

    One isolate takes one platform thread: you can observe the threads created for each isolate in the Call Stack pane of VS Code when debugging a Dart/Flutter app with multiple isolates. If the workload of interest allows parallelism, you can get great performance gains via isolates.

    Note that Dart deliberately abstracts away the implementation details, and the docs avoid specifics about how isolates are scheduled and about their internals.

    Number of isolates ≈ number of CPU cores

    As a rule of thumb, take the number of cores as the initial value for the number of isolates/threads. You can import 'dart:io' and use the Platform.numberOfProcessors property to determine the number of cores. To fine-tune it, though, you need to experiment and see which number works best. There are many factors that can influence the optimal number of threads:

    1. Presence of Simultaneous MultiThreading (SMT) in CPU, such as Intel HyperThreading
    2. Instruction level parallelism (ILP) and specific machine code produced for your code
    3. CPU architecture
    4. Mobile/smartphone vs. desktop scenarios: e.g. desktop Intel CPUs have identical cores and are less prone to throttling, while smartphones mix efficiency and high-performance cores and are prone to throttling, so creating a myriad of threads can lead to worse results because the OS slows your code down.

    E.g. for one of my Flutter apps, which uses multiple isolates to parallelize file processing, I empirically arrived at the following piece of code for determining the number of isolates to create:

    // Requires: import 'dart:io'; (Platform) and import 'dart:math'; (max)
    var numberOfIsolates = max(Platform.numberOfProcessors - 2, 2);
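
    The same heuristic can be wrapped in a small function so that the two constants are easy to tune per device class. This is only a sketch: `suggestedIsolateCount`, `reserved` and `minimum` are hypothetical names introduced here, and the defaults simply mirror the expression above.

    ```dart
    import 'dart:io';
    import 'dart:math';

    /// Hypothetical helper: leaves `reserved` cores free for the UI and the OS,
    /// but never returns fewer than `minimum` isolates.
    int suggestedIsolateCount(int cores, {int reserved = 2, int minimum = 2}) =>
        max(cores - reserved, minimum);

    void main() {
      final cores = Platform.numberOfProcessors;
      print('cores: $cores, isolates: ${suggestedIsolateCount(cores)}');
    }
    ```

    Keeping the core count as a parameter makes it possible to check the function against fixed inputs instead of whatever machine it happens to run on.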
    

    Isolate is not a thread

    The model offered by isolates is far more restrictive than the standard threading model.

    Isolates do not share memory, whereas threads can read each other's variables. There are technical exceptions: e.g. since around Flutter 2.5.0 isolates use one heap, and immutable types such as strings can be shared across isolates. These are implementation details, though, and don't change the concept.

    Isolates communicate only via messages, whereas threads have numerous synchronization primitives (critical sections, locks, semaphores, mutexes etc.).
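
    A small sketch of what "no shared memory" means in practice. It assumes Dart 2.19 or later for `Isolate.run`; `counter` and `bumpInWorker` are hypothetical names used only for illustration. The worker changes its own copy of the top-level variable, and the main isolate never sees that change.

    ```dart
    import 'dart:isolate';

    // Top-level state: every isolate gets its own, separately initialized copy.
    int counter = 0;

    // Runs the closure in a new isolate and returns its result as a message.
    Future<int> bumpInWorker() => Isolate.run(() {
          counter += 42; // modifies the worker isolate's copy only
          return counter;
        });

    Future<void> main() async {
      final seenByWorker = await bumpInWorker();
      print('worker saw: $seenByWorker'); // worker saw: 42
      print('main sees: $counter'); // main sees: 0
    }
    ```

    With threads, the equivalent code would mutate one shared variable and would need a lock to do so safely.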

    The clear tradeoff is that Isolates are not prone to multi-threaded programming horrors (tricky bugs, debugging, development complexity) yet provide fewer capabilities for implementing parallelism.

    In Dart/Flutter there're only 2 ways to work with Isolates:

    1. Low level, Dart style - using the Isolate class to spawn individual isolates, set-up send/receive ports for messaging, code entry points.
    2. Higher level, the compute helper function in Flutter: it gets input params, creates a new isolate with a defined entry point, processes the inputs and provides a single result. It is a request-response pattern, with no back-and-forth communication, streams of events etc.
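
    The low-level style from item 1 can be sketched roughly as follows, in plain Dart without Flutter. The names `echoWorker` and `doubleAll` are hypothetical, and using `null` as a shutdown signal is a convention of this sketch, not something the SDK prescribes.

    ```dart
    import 'dart:isolate';

    // Entry point of the worker isolate. It receives the main isolate's
    // SendPort, sends back its own SendPort, then answers each request.
    void echoWorker(SendPort toMain) {
      final inbox = ReceivePort();
      toMain.send(inbox.sendPort);
      inbox.listen((message) {
        if (message == null) {
          inbox.close(); // shutdown signal: the isolate exits once idle
        } else {
          toMain.send((message as int) * 2);
        }
      });
    }

    // Spawns one worker, sends it every input and collects the replies.
    Future<List<int>> doubleAll(List<int> inputs) async {
      final fromWorker = ReceivePort();
      await Isolate.spawn(echoWorker, fromWorker.sendPort);

      final results = <int>[];
      SendPort? toWorker;
      await for (final message in fromWorker) {
        if (message is SendPort) {
          toWorker = message; // handshake: the worker's inbox
          for (final n in inputs) {
            toWorker.send(n);
          }
        } else {
          results.add(message as int);
        }
        if (toWorker != null && results.length == inputs.length) {
          toWorker.send(null); // ask the worker to stop
          fromWorker.close(); // ends the await-for loop
        }
      }
      return results;
    }

    Future<void> main() async {
      print(await doubleAll([1, 2, 3]));
    }
    ```

    The port handshake is the boilerplate that compute hides: there, the single argument and the single result are the only two messages exchanged.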

    Note that the Dart/Flutter SDK has no parallelism APIs comparable to the Task Parallel Library (TPL) in .NET, which provides multi-core-optimized APIs for processing data on multiple threads, e.g. sorting a collection in parallel. A huge number of algorithms can benefit from parallelism using threads, yet are not feasible with the isolate model, where there's no shared memory. Also, there's no isolate pool, i.e. a set of isolates that are up and running and waiting for incoming tasks (I had to create one myself: https://pub.dev/packages/isolate_pool_2).
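
    To illustrate what data parallelism looks like without shared memory, here is a minimal sketch that sums a list by handing each chunk to a short-lived isolate. It assumes Dart 2.19 or later for `Isolate.run`; `splitIntoChunks`, `sumChunk`, `sumInIsolate` and `parallelSum` are hypothetical names, and `isolates` is assumed to be at least 1. Each chunk has to be sent to its isolate as a message, which is exactly the overhead a shared-memory API such as TPL avoids.

    ```dart
    import 'dart:isolate';
    import 'dart:math';

    /// Splits `items` into at most `parts` contiguous chunks of similar size.
    List<List<int>> splitIntoChunks(List<int> items, int parts) {
      final size = max(1, (items.length / parts).ceil());
      return [
        for (var start = 0; start < items.length; start += size)
          items.sublist(start, min(start + size, items.length)),
      ];
    }

    int sumChunk(List<int> chunk) => chunk.fold<int>(0, (a, b) => a + b);

    // Sends the chunk to a new isolate and receives the partial sum back.
    Future<int> sumInIsolate(List<int> chunk) =>
        Isolate.run(() => sumChunk(chunk));

    /// Sums `items` using one short-lived isolate per chunk.
    Future<int> parallelSum(List<int> items, {int isolates = 4}) async {
      final chunks = splitIntoChunks(items, isolates);
      final partials = await Future.wait([
        for (final chunk in chunks) sumInIsolate(chunk),
      ]);
      return sumChunk(partials);
    }

    Future<void> main() async {
      final data = List<int>.generate(1000, (i) => i + 1);
      print(await parallelSum(data)); // 500500
    }
    ```

    Spawning an isolate per chunk is only worthwhile when the per-chunk work clearly outweighs the spawn and copy cost; for repeated small tasks a pool of long-lived isolates is the better fit.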

    P.S.: the influence of SMT, ILP and other factors on the performance of multiple threads can be observed via the following CPU benchmark (https://play.google.com/store/apps/details?id=xcom.saplin.xOPS). One can see that there's typically a sweet spot in the number of threads doing computations, and that it is greater than the number of cores. E.g. on my 8th-gen Intel i7 MacBook with 6 cores and 12 hardware threads, the best performance was observed with the number of threads at about 4 times the number of cores.