I know that python has a GIL that make threads not be able to run at the same time therefore threading is just context switching.
Why is java different? Threads on the same CPU in every language cannot run parallel.
Is creating new thread in java utilizes cores in multi core machine?
python can only spawn threads on the same CPU, in contrast to java?
If 1. Is the case, when using more threads than CPUs even in java it comes back to context switching again for several of them?
If 1. Is the case then how is it differ from multiprocessing? Because utilizing multiple cores isn't guaranteed?
Isn't the whole point of threading is being able to use the same memory space? If java does run some of them in multiple threads for perallelism, how do they really share memory?
Thank you
Why is java different?
Because it is able to effectively use multiple cores at the same time.
- Does creating a new thread in java utilizes cores in multi core machine?
Yes.
- Python can only spawn threads on the same CPU, in contrast to Java?
Java can spawn multiple threads which will on different CPUs. Java is not responsible for the actual thread scheduling. That is handled by the OS. And the OS may reschedule a thread to a different CPU to the one that it started on.
I am not sure about the precise details for Python, but I think the GIL is an implementation detail rather than something that it intrinsic to the language itself1. But either way, in a Python implementation, the GIL means that you would get little performance benefit in spawning threads on multiple cores. As this page says:
"The Python Global Interpreter Lock or GIL, in simple words, is a mutex (or a lock) that allows only one thread to hold the control of the Python interpreter."
- If 1. is the case, when using more threads than CPUs does it come back to context switching in Java?
It depends. When switching a CPU between threads belonging to different processes, a full context switch is involved. But when switching between threads in the same process, only the (user) registers need to be switched. (The virtual memory registers and caches don't need to be switched / flushed because the threads share the same virtual address space.)
- If 1. is the case then how is it differ from multiprocessing? Because utilizing multiple cores isn't guaranteed?
The key difference between multi-threading and multi-processing is that processes do not share any memory. By contrast, one thread in a process can see the memory of all of the others ... modulo issues of when changes are visible.
This difference has a variety of consequences.
- Isn't the whole point of threading is being able to use the same memory space?
Yes, that is the main point ... when you compare multi-threading with multi-processing.
If Java does run some of them in multiple threads for parallelism ...
Java supports threads for many reasons. Parallelism is only one of those reasons. Others include multiplexing I/O and simplifying certain kinds of programming problem. These other reasons are also relevant to Python.
... how do [Java threads] really share memory?
The hardware deals with the issues of making the physical memory visible to all of the threads, and propagation of changes via the memory caches. It is complicated.
In Java the onus is on the programmer to "do the right thing" when threads make use of shared variables / objects. You need to use volatile
variables, or synchronized
blocks / methods, or something else that ensures that there is a happens before chain between a write and subsequent read. (Otherwise you can get issues with changes not being visible.)
This transfer of responsibility to the programmer allows the compiler to generate code with fewer main memory operations ... and hence that is faster. The downside is that if an application doesn't obey the rules, it is liable to behave in unexpected ways.
By contrast, in Python the memory model is unspecified, but there is an expectation (by millions of Python programmers) that it will behave in an intuitive fashion; i.e. a shared variable write performed by one thread will immediately be visible to other threads. This is hard to achieve efficiently while also allowing Python threads to run in parallel.
1 - While the GIL is not formally part of the Python spec, the influence of GIL on the (unspecified!) Python memory model and Python programmers assumptions make it more than merely an implementation detail. It remains to be seen if Python can successfully evolve into a language where multi-threading can use multiple cores effectively.