I have a curious case. Unfortunately, I did not manage to reproduce the computation times in the MWE below but my project is basically a more crowded version. I wanted to speedup my code so I tried out differential_evolution
with worker=-1
. This did not work because of pickling issues (AttributeError: Can't pickle local object
); I don't really want to define my cost-function at the top level of my module. So I used worker=Pool(ncpu).map
instead. In my local MWE this works, but when migrating this change to my project I observe the following.
I observe that using two different approaches I find that differential_evolution
takes more then 10 times as much time to finish in the second (class-based) approach then the first approach. When I look into my CPU load for approach 1 I see that all cores are used. However in approach 2 basically only one core is being used. When I look at the output via disp=True
I also see that the func
evaluations are printed way slower as well in the 2nd approach compared to the 1st.
Any idea what that might be? Tx.
from os import cpu_count
from time import time
import numpy as np
from pathos.multiprocessing import ProcessingPool as Pool
from scipy.optimize import differential_evolution
from scipy.integrate import quad
from scipy.interpolate import CubicSpline
atol=1e-13
rtol=1e-10
T = 10
bounds = [(-T,T)]
# --- Approach 1 --- #
x = np.linspace(0,10*2*np.pi,100)
y = np.cos(x)
cs = CubicSpline(x, y)
def f(t):
return cs(t)**2
ctime = time()
def cost(t):
t = t[0]
q = 0
for _ in range(50):
q += quad(lambda s: f(s*t), -T, T)[0]
return q
result = differential_evolution(cost, bounds, atol=atol, tol=rtol, updating='deferred', workers=Pool(cpu_count()).map)
print("Approach 1:", time()-ctime)
# --- Approach 2 --- #
class A():
def define_f(self):
x = np.linspace(0,10*2*np.pi,100)
y = np.cos(x)
cs = CubicSpline(x, y)
def f(t):
return cs(t)**2
return f
f = A().define_f()
class B():
def solve(self, f):
def cost(t):
t = t[0]
q = 0
for _ in range(50):
q += quad(lambda s: f(s*t), -T, T)[0]
return q
ctime = time()
result = differential_evolution(cost, bounds, atol=atol, tol=rtol, updating='deferred', workers=Pool(cpu_count()).map)
print("Approach 2:", time()-ctime)
B().solve(f=f)
Produces something like:
Approach 1: 4.710453510284424
Approach 2: 5.638171672821045
However in my project the second timestamp is an order of magnitude larger.
Below is a vectorized version of the code I'm actually using without the provided pool.map
: y0,y1
are solutions obtained from solving the variational equation of a 2-dimension nonlinear system for periodic solutions via scipy.integrat.solve_ivp
. The first step in cost
of reshaping the t
array is necessary since scipy's solution object requires the shape to be (n,)
.
# Define fundamental solution.
def W(t: np.ndarray) -> np.ndarray:
if type(t) is np.ndarray: return np.stack([y0.sol(t).T, y1.sol(t).T], axis=-1).swapaxes(1,2)
else: return np.array([y0.sol(t), y1.sol(t)])
# Define monodromy matrix and related quantities.
C = W(period)
C1 = np.eye(C.shape[0]) - C
C2 = np.linalg.solve(C, C1)
# Define cost function.
def cost(t):
if len(t.shape) > 1:
t = np.reshape(t, (t.shape[1],))
B = W(t)
def H(s: float, t: np.ndarray=t, B: np.ndarray=B):
mask_s_le_t = s <= t
W_s = W(s)
X0 = np.linalg.solve(C1 @ W_s, B)
X1 = np.linalg.solve(C2 @ W_s, B)
return np.where(mask_s_le_t[:, None, None], X0, X1)
return -quad_vec(lambda s: np.sum(H(s)**2, axis=(1,2)), 0, period)[0]
# Maximize.
maxi = -minimize(cost, bounds=[(0,period)], atol=atol, tol=rtol, updating='deferred', vectorized=1).fun
I have a curious case. Unfortunately, I did not manage to reproduce the computation times in the MWE below but my project is basically a more crowded version. I wanted to speedup my code so I tried out
differential_evolution
withworker=-1
. This did not work because of pickling issues (AttributeError: Can't pickle local object
); I don't really want to define my cost-function at the top level of my module. So I usedworker=Pool(ncpu).map
instead.
I have personally found that the multiprocessing support built into the standard library works better than pathos. I would suggest solving this problem by making cost()
picklable.
You mention that you don't want to define your cost function at the module top level, but this is not the only way to approach the problem. If you have state that you want to pass along within a function that you want to pickle, you have three options for how to solve this: functools.partial()
, SciPy args
, and defining methods on a class. (Technically, you can also do this by defining the variable as a global, as you do in Approach #1, but as this is not portable, I don't include it.)
Of these three approaches, defining methods on a class can make this picklable without defining your cost function at top level.
Example:
from os import cpu_count
from time import time
import numpy as np
from multiprocessing import Pool
from scipy.optimize import differential_evolution
from scipy.integrate import quad
from scipy.interpolate import CubicSpline
atol=1e-13
rtol=1e-10
T = 10
bounds = [(-T,T)]
# --- Approach 3 --- #
class A():
def f(self, t):
return self.cs(t)**2
def define_f(self):
x = np.linspace(0,10*2*np.pi,100)
y = np.cos(x)
cs = CubicSpline(x, y)
self.cs = cs
return self.f
class B():
def cost(self, t):
t = t[0]
q = 0
for _ in range(50):
q += quad(lambda s: self.f(s*t), -T, T)[0]
return q
def solve(self, f):
ctime = time()
self.f = f
with Pool(cpu_count()) as pool:
result = differential_evolution(self.cost, bounds, atol=atol, tol=rtol, updating='deferred', workers=pool.map)
print("Approach 3:", time()-ctime)
if __name__ == '__main__':
f = A().define_f()
B().solve(f=f)
On my system, this is actually faster than approach 1:
Approach 1: 11.039157629013062
Approach 2: 11.278709650039673
Approach 3: 10.805178165435791
Note that workers=multiprocessing.Pool(cpu_count())
is roughly equivalent to workers=-1
, since differential_evolution()
uses the built in multiprocessing support by default.