c++ boost opencl odeint vexcl

Boost odeint: what does concurrency with VexCL mean, and how can it be improved?


My question concerns the tutorial that explains how to use boost::odeint with VexCL to achieve concurrency (the complete code can be found here).

The following figure shows how I picture the iterations of odeint: [figure not available]

Now I wonder which part of this, exactly, VexCL parallelises.

My impression is that the ODE part is a single task, since all equations of the ODE sit in one block in the given example. Perhaps the integration part runs as three parallel tasks. That gives four tasks, of which (I think) the ODE task is the bottleneck, because the equations can become very large.

If this is right, I would like to know how to improve the concurrency. I think it makes sense to combine ODE and INT horizontally. That would give three tasks, none of which can be reduced further at this level.


Solution

  • The example you linked to performs a parameter study of the Lorenz system. That is, it solves a large number of copies of the same equations with different parameters. The state type is vex::multivector<double,3>, which packs together the states (3D coordinates) of many Lorenz systems. This is an embarrassingly parallel problem, so the odeint algorithm can be applied to all the states in lock-step. That is, operations like x += tau * dt, where x and dt are large vectors, are performed on a GPU.

    More details about the odeint/VexCL implementation can be found in [1]; [2] is an interesting paper on how to extract parallelism in the case of coupled systems.

    [1] Ahnert, Karsten, Denis Demidov, and Mario Mulansky. "Solving ordinary differential equations on GPUs." Numerical Computations with GPUs. Springer, Cham, 2014. 125-157. https://doi.org/10.1007/978-3-319-06548-9_7

    [2] Mulansky, Mario. "Optimizing Large-Scale ODE Simulations." arXiv preprint arXiv:1412.0544 (2014).
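To make the lock-step idea concrete, here is a minimal host-side sketch in plain C++. It is not the tutorial's code and uses neither VexCL nor odeint. The names EnsembleState, lorenz_rhs and euler_step are hypothetical, and explicit Euler is swapped in for the odeint stepper to keep the example short. The state is stored component by component, in the spirit of vex::multivector<double,3>, and every loop over i touches only system i. Loops of this kind are what a GPU backend can run in parallel, which suggests that both the right-hand side and the integration update are spread across all systems rather than forming one serial task.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical struct-of-arrays state: component k of system i is x[k][i].
// This imitates the layout idea of vex::multivector<double,3> on the host.
struct EnsembleState {
    std::vector<double> x[3];
    explicit EnsembleState(std::size_t n, double init = 0.0) {
        for (auto &c : x) c.assign(n, init);
    }
    std::size_t size() const { return x[0].size(); }
};

// Fixed Lorenz parameters; R varies per system (the parameter study).
const double sigma = 10.0;
const double b = 8.0 / 3.0;

// Right-hand side of all Lorenz systems at once.
// Each iteration i is independent of every other iteration.
void lorenz_rhs(const EnsembleState &s, const std::vector<double> &R,
                EnsembleState &dxdt) {
    const std::size_t n = s.size();
    for (std::size_t i = 0; i < n; ++i) {
        const double X = s.x[0][i], Y = s.x[1][i], Z = s.x[2][i];
        dxdt.x[0][i] = sigma * (Y - X);
        dxdt.x[1][i] = R[i] * X - Y - X * Z;
        dxdt.x[2][i] = X * Y - b * Z;
    }
}

// One explicit Euler step for the whole ensemble (assumption: Euler stands
// in for the stepper used in the tutorial). The update is again elementwise.
void euler_step(EnsembleState &s, const std::vector<double> &R,
                EnsembleState &dxdt, double dt) {
    lorenz_rhs(s, R, dxdt);
    for (int k = 0; k < 3; ++k)
        for (std::size_t i = 0; i < s.size(); ++i)
            s.x[k][i] += dt * dxdt.x[k][i];
}
```

Calling euler_step repeatedly on an EnsembleState with n entries advances all n systems together. In this picture the parallelism runs across the systems (the index i), not across the three equations of a single system.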