To make my modeling code neater I've been using namedtuples to manage model parameters. I would like to use SciPy's parallelized implementation of differential evolution to fit my model to data, but I can only get it to work in series.
The documentation for differential_evolution stipulates that the objective function must be "pickleable" for parallel optimization, and passing namedtuples as objective-function arguments seems to violate this requirement. Is there a workaround that doesn't involve completely rewriting how my modeling code handles parameters?
A simplified example is below.
code:
from collections import namedtuple
from scipy.optimize import differential_evolution
def rosenbrock(x, par):
    """Rosenbrock function for testing optimization algorithms."""
    return (par.a - x[0])**2 + par.b*(x[1] - x[0]**2)**2


if __name__ == '__main__':
    # Define a namedtuple class for creating model parameter namedtuples.
    parameters_nt = namedtuple('parameters', 'a b')
    # Create a model parameter namedtuple with a=2 and b=3 (global minimum at [2, 4]).
    par01 = parameters_nt(2, 3)
    # Define optimization bounds.
    bounds = [(0, 10), (0, 10)]
    # Attempt to optimize in series.
    series_result = differential_evolution(rosenbrock, bounds, args=(par01,))
    print(series_result.x)
    # Attempt to optimize in parallel.
    parallel_result = differential_evolution(rosenbrock, bounds, args=(par01,),
                                             updating='deferred', workers=-1)
    print(parallel_result.x)
program output:
[2. 4.]
Traceback (most recent call last):
  File "parallel_test.py", line 23, in <module>
    parallel_result = differential_evolution(rosenbrock, bounds, args=(par01,), updating='deferred', workers=-1)
  File "/home/jack/miniconda3/lib/python3.7/site-packages/scipy/optimize/_differentialevolution.py", line 276, in differential_evolution
    ret = solver.solve()
  File "/home/jack/miniconda3/lib/python3.7/site-packages/scipy/optimize/_differentialevolution.py", line 688, in solve
    self.population)
  File "/home/jack/miniconda3/lib/python3.7/site-packages/scipy/optimize/_differentialevolution.py", line 789, in _calculate_population_energies
    parameters_pop[0:nfevs]))
  File "/home/jack/miniconda3/lib/python3.7/site-packages/scipy/_lib/_util.py", line 412, in __call__
    return self._mapfunc(func, iterable)
  File "/home/jack/miniconda3/lib/python3.7/multiprocessing/pool.py", line 268, in map
    return self._map_async(func, iterable, mapstar, chunksize).get()
  File "/home/jack/miniconda3/lib/python3.7/multiprocessing/pool.py", line 657, in get
    raise self._value
  File "/home/jack/miniconda3/lib/python3.7/multiprocessing/pool.py", line 431, in _handle_tasks
    put(task)
  File "/home/jack/miniconda3/lib/python3.7/multiprocessing/connection.py", line 206, in send
    self._send_bytes(_ForkingPickler.dumps(obj))
  File "/home/jack/miniconda3/lib/python3.7/multiprocessing/reduction.py", line 51, in dumps
    cls(buf, protocol).dump(obj)
_pickle.PicklingError: Can't pickle <class '__main__.parameters'>: attribute lookup parameters on __main__ failed
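For what it's worth, the failure seems to happen before SciPy ever evaluates the objective: with workers=-1, multiprocessing has to pickle everything in args to send it to the worker processes, and pickle records a namedtuple instance by looking its class up by name in its defining module. Since the class here is named 'parameters' but is only bound to the variable parameters_nt, the lookup of __main__.parameters fails. A minimal sketch that, as far as I can tell, reproduces the same error without SciPy:

import pickle
from collections import namedtuple

# Same definitions as in the script above, but no SciPy and no multiprocessing.
parameters_nt = namedtuple('parameters', 'a b')
par01 = parameters_nt(2, 3)

# pickle refers to the instance's class as __main__.parameters, and since no
# module-level name 'parameters' exists, this raises the same PicklingError
# shown in the traceback above.
pickle.dumps(par01)

So the problem appears to be with pickling the namedtuple itself rather than with differential_evolution.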
I modified my code so that the objective function receives its parameters as a dictionary and converts that dictionary back to a namedtuple internally. This runs in both series and parallel.
code:
from collections import namedtuple
from scipy.optimize import differential_evolution
def rosenbrock(x, par):
    """Rosenbrock function for testing optimization algorithms."""
    # Convert parameter dictionary to namedtuple.
    par = convert_par_type(par)
    return (par.a - x[0])**2 + par.b*(x[1] - x[0]**2)**2


def convert_par_type(par):
    """Converts a parameter namedtuple to a dictionary and vice versa."""
    if type(par) == parameters_nt:
        par = dict(par._asdict())
    elif type(par) == dict:
        par = parameters_nt(**par)
    else:
        raise TypeError
    return par


if __name__ == '__main__':
    # Define a namedtuple factory object for generating model parameter namedtuples.
    parameters_nt = namedtuple('parameters', 'a b')
    # Create a model parameter namedtuple with a=2 and b=3 (global minimum at [2, 4]).
    par01 = parameters_nt(2, 3)
    # Convert model parameter namedtuple to dictionary.
    par02 = convert_par_type(par01)
    # Define optimization bounds.
    bounds = [(0, 10), (0, 10)]
    # Attempt to optimize in series.
    series_result = differential_evolution(rosenbrock, bounds, args=(par02,))
    print(series_result.x)
    # Attempt to optimize in parallel.
    parallel_result = differential_evolution(rosenbrock, bounds, args=(par02,),
                                             updating='deferred', workers=-1)
    print(parallel_result.x)
program output:
[2. 4.]
[2. 4.]
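As far as I can tell, this works because the dictionary consists only of built-in types, so multiprocessing can pickle it and ship it to the workers, while the namedtuple is rebuilt locally inside the objective function and never has to cross a process boundary. A quick sanity check of that assumption:

import pickle

# The plain parameter dictionary round-trips through pickle without trouble,
# which is all multiprocessing needs in order to pass args to the workers.
par02 = {'a': 2, 'b': 3}
print(pickle.loads(pickle.dumps(par02)))  # {'a': 2, 'b': 3}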