[SOLVED] Kepler CUDA dynamic parallelism and thread divergence

There is very little information on Kepler's dynamic parallelism. From the description of this new technology, does it mean that the problem of control flow divergence among threads in the same warp is solved?

It allows recursion and launching kernels from device code. Does that mean that the control paths of different threads can be executed simultaneously?

Solution

Take a look at this paper.

Dynamic parallelism, flow divergence and recursion are separate concepts. Dynamic parallelism is the ability to launch a kernel (a new grid of threads) from within a thread that is already running on the device. This means, for example, that you can do this:

__global__ void t_father(...) {
   ...
   // launch a child grid from device code
   t_child<<<BLOCKS, THREADS>>>();
   ...
}

I looked into this area myself. When t_father launches t_child, the GPU resources are redistributed among the child grids. The launch itself is asynchronous with respect to the launching thread, but the t_father grid is not considered complete until all of its t_child grids have finished (see also Slide 25 of this paper).
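To make the parent/child launch concrete, here is a minimal sketch of a complete version. The helper name run_father, the array size and the launch dimensions are my own choices for illustration, not something from the paper. It assumes a device of compute capability 3.5 or higher and a build with relocatable device code (something like nvcc -arch=sm_35 -rdc=true file.cu -lcudadevrt).

```cuda
#include <vector>
#include <cuda_runtime.h>

// Child kernel: each thread writes its global index into out.
__global__ void t_child(int *out, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) out[i] = i;
}

// Parent kernel: one thread launches the child grid from device code.
__global__ void t_father(int *out, int n) {
    if (blockIdx.x == 0 && threadIdx.x == 0) {
        int threads = 128;
        int blocks = (n + threads - 1) / threads;
        // Asynchronous with respect to this thread.
        t_child<<<blocks, threads>>>(out, n);
    }
}

// Hypothetical host helper (assumes n > 0): returns the sum of the values
// written by the child grid, or -1 on a CUDA error.
long long run_father(int n) {
    int *d_out = nullptr;
    if (cudaMalloc(&d_out, n * sizeof(int)) != cudaSuccess) return -1;
    cudaMemset(d_out, 0, n * sizeof(int));

    t_father<<<1, 32>>>(d_out, n);
    cudaError_t err = cudaGetLastError();
    // The parent grid completes only after its child grid has completed,
    // so synchronizing on the parent is enough to see the child's writes.
    if (err == cudaSuccess) err = cudaDeviceSynchronize();

    std::vector<int> h(n, 0);
    if (err == cudaSuccess)
        err = cudaMemcpy(h.data(), d_out, n * sizeof(int), cudaMemcpyDeviceToHost);
    cudaFree(d_out);
    if (err != cudaSuccess) return -1;

    long long sum = 0;
    for (int v : h) sum += v;
    return sum;
}
```

Calling run_father(1000) from main should give 0 + 1 + ... + 999 = 499500 if the launch succeeds. The host only waits on t_father, which is the point of the sketch: the child grid's work is already done by the time the parent grid counts as finished.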

Recursion has been available since Fermi. It is the ability of a device function to call itself within the same thread, without any thread/block reconfiguration.
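For comparison, a small sketch of plain recursion on the device. No kernel is launched from device code here; each thread just calls the same function again on its own stack. The names factorial, fact_kernel and device_factorial are made up for this example, and it assumes compute capability 2.0 or higher.

```cuda
#include <cuda_runtime.h>

// Recursive device function: the call happens inside the same thread,
// with no new grid and no change to the thread/block configuration.
__device__ unsigned long long factorial(unsigned int n) {
    return (n <= 1) ? 1ULL : n * factorial(n - 1);
}

// Thread i computes i! for every i in [0, n].
__global__ void fact_kernel(unsigned long long *out, unsigned int n) {
    unsigned int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i <= n) out[i] = factorial(i);
}

// Hypothetical host helper: returns k! computed on the device.
// Limited to k <= 20 so the result fits in 64 bits; returns 0 on error.
unsigned long long device_factorial(unsigned int k) {
    if (k > 20) return 0;
    unsigned long long *d_out = nullptr;
    unsigned long long h_out[21] = {0};
    if (cudaMalloc(&d_out, 21 * sizeof(unsigned long long)) != cudaSuccess) return 0;

    fact_kernel<<<1, 32>>>(d_out, k);
    // cudaMemcpy on the default stream waits for the kernel to finish.
    cudaError_t err = cudaMemcpy(h_out, d_out,
                                 (k + 1) * sizeof(unsigned long long),
                                 cudaMemcpyDeviceToHost);
    cudaFree(d_out);
    return (err == cudaSuccess) ? h_out[k] : 0;
}
```

For example, device_factorial(10) should return 3628800. For deep recursion the per-thread stack may need to be enlarged with cudaDeviceSetLimit(cudaLimitStackSize, ...); that is not needed at this depth.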

Regarding flow divergence, I guess we will never see threads within a warp executing different code simultaneously.
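As a rough illustration of what divergence means at the warp level, the sketch below lets all 32 lanes of one warp vote on a branch condition before the branch is taken. If the resulting mask is neither empty nor full, the lanes disagree and the warp has to handle both paths. The names ballot_kernel, ballot_even_lanes and warp_would_diverge are hypothetical, and the code assumes CUDA 9 or later (for __ballot_sync) and a warp size of 32. It only shows which lanes would go which way; it does not measure how the hardware schedules the two paths.

```cuda
#include <cuda_runtime.h>

// One warp votes on a parity predicate: even lanes would take path A,
// odd lanes would take path B.
__global__ void ballot_kernel(unsigned int *out) {
    bool takes_path_a = (threadIdx.x % 2) == 0;
    // All 32 lanes participate; bit i is set if lane i votes true.
    unsigned int mask = __ballot_sync(0xffffffffu, takes_path_a);
    if (threadIdx.x == 0) *out = mask;
}

// Hypothetical host helper: returns the vote mask, or 0 on a CUDA error.
unsigned int ballot_even_lanes() {
    unsigned int *d_out = nullptr;
    unsigned int h_out = 0;
    if (cudaMalloc(&d_out, sizeof(unsigned int)) != cudaSuccess) return 0;

    ballot_kernel<<<1, 32>>>(d_out);  // exactly one full warp
    cudaError_t err = cudaMemcpy(&h_out, d_out, sizeof(unsigned int),
                                 cudaMemcpyDeviceToHost);
    cudaFree(d_out);
    return (err == cudaSuccess) ? h_out : 0;
}

// A warp disagrees on a branch when some lanes vote true and some false.
bool warp_would_diverge(unsigned int mask) {
    return mask != 0u && mask != 0xffffffffu;
}
```

With the parity predicate, the even lanes set their bits, so the mask should be 0x55555555 and warp_would_diverge reports true. A condition that depends only on the warp index instead of the lane index would give an all-or-nothing mask and no divergence.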