Tags: python, parallel-processing, scipy, mathematical-optimization, differential-evolution

Parallel/multithreaded differential evolution in Python


I'm trying to model a biochemical process, and I framed my question as an optimization problem that I solve using differential_evolution from scipy.
So far, so good: I'm pretty happy with the implementation of a simplified model with 15-19 parameters.
I expanded the model and now, with 32 parameters, it is taking way too long. Not totally unexpected, but still an issue, hence the question.

I've seen:
- an almost identical question for R, "Parallel differential evolution"
- and a GitHub issue https://github.com/scipy/scipy/issues/4864 on the topic

but I would like to stay in Python (the model is within a Python pipeline), and the pull request has not led to an officially accepted solution yet, although some options have been suggested.

Also, I can't parallelize the code within the function to be optimised because it is a series of sequential calculations, each requiring the result of the previous step. The ideal option would be something that evaluates some individuals in parallel and returns them to the population.
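
To make the request concrete, here is a minimal sketch of the kind of loop described above: a plain rand/1/bin differential evolution in which each generation's trial individuals are scored in one batch through a pluggable map function. It is not SciPy's implementation. The names `de_sketch`, `sphere` and `map_fn` are hypothetical, and the sum-of-squares objective only stands in for the real (expensive, sequential) model.

```python
import random


def sphere(x):
    """Hypothetical stand-in for the expensive model: sum of squares."""
    return sum(v * v for v in x)


def de_sketch(objective, bounds, pop_size=20, generations=100,
              mutation=0.8, crossover=0.7, seed=0, map_fn=map):
    """Illustrative rand/1/bin DE; map_fn scores a whole batch at once."""
    rng = random.Random(seed)
    dim = len(bounds)
    # Initial population sampled uniformly inside the bounds.
    pop = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(pop_size)]
    # Individuals are independent, so this batch can be evaluated in parallel.
    scores = list(map_fn(objective, pop))
    for _ in range(generations):
        trials = []
        for i in range(pop_size):
            # Three distinct individuals other than i build the mutant.
            a, b, c = rng.sample([j for j in range(pop_size) if j != i], 3)
            forced = rng.randrange(dim)  # at least one gene comes from the mutant
            trial = []
            for k, (lo, hi) in enumerate(bounds):
                if k == forced or rng.random() < crossover:
                    value = pop[a][k] + mutation * (pop[b][k] - pop[c][k])
                    value = min(max(value, lo), hi)  # clip to the bounds
                else:
                    value = pop[i][k]
                trial.append(value)
            trials.append(trial)
        # The only expensive step: one objective call per trial individual.
        trial_scores = list(map_fn(objective, trials))
        # Greedy selection: a trial replaces its parent only if it is no worse.
        for i in range(pop_size):
            if trial_scores[i] <= scores[i]:
                pop[i], scores[i] = trials[i], trial_scores[i]
    best = min(range(pop_size), key=scores.__getitem__)
    return pop[best], scores[best]


best_x, best_f = de_sketch(sphere, [(-5.0, 5.0)] * 5, generations=30)
```

With the default `map_fn=map` everything runs serially. Passing something like `multiprocessing.Pool(4).map` instead would spread each batch over four processes, provided the objective is a picklable top-level function; the objective itself stays sequential, only different individuals run side by side.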

Summing up:
- Is there any option within scipy that allows parallelization of differential_evolution that I dumbly overlooked? (Ideal solution)
- Is there a suggestion for an alternative algorithm in scipy that is either (way) faster in serial or possible to parallelize?
- Is there any other good package that offers parallelized differential evolution functions? Or other applicable optimization methods?
- Sanity check: am I overloading DE with 32 parameters, so that I need to radically change approach?

PS
I'm a biologist; formal math/statistics isn't really my strength, so any formula-to-English translation would be hugely appreciated :)

PPS
As an extreme option I could try to migrate to R, but I can't code C/C++ or other languages.


Solution

  • Thanks to @jp2011 for pointing to pygmo

    First, it is worth noting the difference from pygmo 1, since the first link on Google still directs to the older version.

    Second, multiprocessing islands are available only for Python 3.4+.

    Third, it works. The processes I started when I first asked the question are still running as I write, while the pygmo archipelago running an extensive test of all 18 DE variations present in saDE finished in less than 3 h. The compiled version using Numba, as suggested here https://esa.github.io/pagmo2/docs/python/tutorials/coding_udp_simple.html, will probably finish even earlier. Chapeau.

    I personally find it a bit less intuitive than the scipy version, given the need to build a new class (vs a single function in scipy) to define the problem, but that is probably just a personal preference. Also, the mutation/crossover parameters are defined less clearly, which might be a bit obscure for someone approaching DE for the first time.
    But, since serial DE in scipy just isn't cutting it, welcome pygmo(2).

    Additionally, I found a couple of other options claiming to parallelize DE. I didn't test them myself, but they might be useful to someone stumbling on this question.

    Platypus, focused on multiobjective evolutionary algorithms https://github.com/Project-Platypus/Platypus

    Yabox
    https://github.com/pablormier/yabox

    From Yabox's creator, a detailed yet IMHO crystal-clear explanation of DE https://pablormier.github.io/2017/09/05/a-tutorial-on-differential-evolution-with-python/
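
For anyone wondering what "building a new class" for pygmo looks like in practice, here is a minimal sketch, not the code I actually ran. `ModelFit`, `run_archipelago`, the sum-of-squares objective and the bounds are hypothetical placeholders; only the `fitness`/`get_bounds` methods and the `pg.problem`, `pg.algorithm`, `pg.sade` and `pg.archipelago` calls come from pygmo 2. It assumes pygmo 2 is installed and that pygmo picks a multiprocessing island for a pure-Python problem.

```python
class ModelFit:
    """Hypothetical user-defined problem (UDP) standing in for the real model."""

    def __init__(self, dim=32):
        self.dim = dim

    def fitness(self, x):
        # pygmo expects a sequence of objective values, even for one objective.
        # Placeholder objective (sum of squares); the real model would go here.
        return [sum(v * v for v in x)]

    def get_bounds(self):
        # Lower and upper bound for each parameter (placeholder values).
        return ([-5.0] * self.dim, [5.0] * self.dim)


def run_archipelago(n_islands=8, pop_size=20, generations=500):
    """Evolve independent populations in parallel, one per island."""
    import pygmo as pg  # imported here so the class above works without pygmo

    prob = pg.problem(ModelFit(dim=32))
    algo = pg.algorithm(pg.sade(gen=generations))  # self-adaptive DE
    archi = pg.archipelago(n=n_islands, algo=algo, prob=prob, pop_size=pop_size)
    archi.evolve()      # starts the evolution asynchronously
    archi.wait_check()  # blocks until done and re-raises any island error
    scores = [f[0] for f in archi.get_champions_f()]
    best = min(range(n_islands), key=scores.__getitem__)
    return archi.get_champions_x()[best], scores[best]
```

Calling `run_archipelago()` from a script (inside an `if __name__ == "__main__":` guard, which multiprocessing needs on some platforms) should return the best parameter vector and its score across all islands. The class plays the role of the single function passed to scipy: `fitness` is the objective and `get_bounds` replaces the `bounds` argument.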