On a machine with multiple cores, Java concurrency makes it possible to run multiple threads across those cores. We also have streams in Java, which can help distribute the work.
However, how do we ensure that threads are properly distributed across the cores so that we make efficient use of them?
How does thread distribution differ between Windows and Linux operating systems? And how does it differ between Intel and AMD processors? Do we need to handle threads in specific ways for different operating systems and processors?
Typically you would want to match the number of computational threads to the number of hardware threads.
This is the default behaviour of ForkJoinPool, configurable through another constructor or, for the common pool, the java.util.concurrent.ForkJoinPool.common.parallelism system property. You may want to lower this if there is other computationally intensive work happening on the system. The Stream API doesn't appear to specify the behaviour (I may be wrong), but a reasonable implementation would use the current ForkJoinPool (i.e. the pool the current task is running in, or the common pool if running outside of any pool).
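As a concrete sketch (the printed numbers depend on your hardware, and the run-a-parallel-stream-inside-a-custom-pool part relies on the unspecified implementation behaviour described above), you can inspect and override the parallelism like this:

```java
import java.util.concurrent.ForkJoinPool;
import java.util.stream.LongStream;

public class ParallelismDemo {
    public static void main(String[] args) throws Exception {
        // Hardware threads visible to the JVM.
        System.out.println("Available processors: "
                + Runtime.getRuntime().availableProcessors());

        // The common pool defaults to (available processors - 1),
        // unless overridden with e.g.
        // -Djava.util.concurrent.ForkJoinPool.common.parallelism=4
        System.out.println("Common pool parallelism: "
                + ForkJoinPool.commonPool().getParallelism());

        // Explicit parallelism via the ForkJoinPool constructor.
        ForkJoinPool pool = new ForkJoinPool(2);

        // A parallel stream started from inside a pool's task will, in
        // current implementations (not specified!), use that pool rather
        // than the common pool.
        long sum = pool.submit(() ->
                LongStream.rangeClosed(1, 1_000_000).parallel().sum()
        ).get();
        System.out.println("Sum: " + sum);

        pool.shutdown();
    }
}
```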
As ever, parallelism may make your program speed go down as well as up.
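For example (an unscientific single-shot timing, just to illustrate the point; a real measurement would need warm-up and a harness such as JMH), a parallel stream over a trivially small workload can lose to the sequential version because of the splitting and coordination overhead:

```java
import java.util.stream.LongStream;

public class TinyWorkload {
    public static void main(String[] args) {
        long t0 = System.nanoTime();
        long seq = LongStream.range(0, 1_000).sum();
        long t1 = System.nanoTime();
        long par = LongStream.range(0, 1_000).parallel().sum();
        long t2 = System.nanoTime();

        // For such a small range the fork/join overhead often dwarfs
        // the actual work, so the parallel run can be the slower one.
        System.out.printf("sequential: %d in %d us%n", seq, (t1 - t0) / 1_000);
        System.out.printf("parallel:   %d in %d us%n", par, (t2 - t1) / 1_000);
    }
}
```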
Scheduling threads onto cores is down to the platform, typically delegated to the operating system. Some JRE implementations in the past moved Java threads between OS threads, but that was about handling blocking I/O, not core affinity.