Below is sample code demonstrating the use of scipy.optimize.differential_evolution, which I asked ChatGPT to generate. The output shows that the run terminated once the reported convergence value exceeded one, with the message "Optimization terminated successfully".
import numpy as np
from scipy.optimize import differential_evolution

# Define a sample function to be optimized (Rastrigin function in this case)
def rastrigin(x):
    return sum([(i**2 + 10 - 10*np.cos(2*np.pi*i)) for i in x])

# Callback function to print the convergence value at each iteration
def callback(xk, convergence):
    print(f"Current parameters: {xk}, Convergence: {convergence:.6f}")

bounds = [(-5.12, 5.12), (-5.12, 5.12)]  # For a 2D Rastrigin function
result = differential_evolution(rastrigin, bounds, callback=callback)

print("\nOptimized Result:")
print(result)
Current parameters: [-0.05500736 1.12167317], Convergence: 0.019343
Current parameters: [-0.05500736 1.12167317], Convergence: 0.021779
Current parameters: [-0.05500736 1.12167317], Convergence: 0.023104
Current parameters: [-1.0372644 0.95886127], Convergence: 0.021842
Current parameters: [-1.0372644 0.95886127], Convergence: 0.022447
Current parameters: [-1.0372644 0.95886127], Convergence: 0.020804
Current parameters: [-1.0372644 0.95886127], Convergence: 0.019910
Current parameters: [-1.0372644 0.95886127], Convergence: 0.020295
Current parameters: [-0.92414087 -0.03163365], Convergence: 0.019972
Current parameters: [-0.92414087 -0.03163365], Convergence: 0.018159
Current parameters: [-0.92414087 -0.03163365], Convergence: 0.019535
Current parameters: [-1.01618653 -0.01727175], Convergence: 0.016007
Current parameters: [-1.01618653 -0.01727175], Convergence: 0.017456
Current parameters: [-1.01618653 -0.01727175], Convergence: 0.015801
Current parameters: [-0.98535569 0.02419573], Convergence: 0.014148
Current parameters: [-0.9894422 -0.00648482], Convergence: 0.018350
Current parameters: [-0.9894422 -0.00648482], Convergence: 0.015497
Current parameters: [-0.9894422 -0.00648482], Convergence: 0.050019
Current parameters: [-0.99360956 -0.00208593], Convergence: 0.172460
Current parameters: [-9.93609564e-01 -2.49866280e-04], Convergence: 0.289696
Current parameters: [-9.93609564e-01 -2.49866280e-04], Convergence: 0.352541
Current parameters: [-9.94934163e-01 -7.51133414e-05], Convergence: 1.135028
Optimized Result:
message: Optimization terminated successfully.
success: True
fun: 0.9949590570932987
x: [-9.950e-01 -4.988e-09]
nit: 22
nfev: 702
jac: [ 3.553e-07 0.000e+00]
I expected that a successful optimization would correspond to the convergence value approaching zero, so I'm curious how "convergence" is defined in this context. Unfortunately, I couldn't work it out from the official documentation: https://docs.scipy.org/doc/scipy/reference/generated/scipy.optimize.differential_evolution.html
If you search for the callback signature in the documentation you linked, it tells you what to expect:

"val represents the fractional value of the population convergence. When val is greater than one the function halts."
Not a full explanation. But at least the fact that it stops at one is no surprise now.
The library authors can expect you to look up the callback signature before implementing a callback, if only to know what arguments it takes, so they can also expect that you've read this section before running the code.
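Incidentally, the same callback section also documents the return value: if the callback returns True, the solve is halted early. Here is a minimal sketch reusing the older callback signature from your code; the 0.25 threshold is arbitrary, purely to show the mechanism:

import numpy as np
from scipy.optimize import differential_evolution

def rastrigin(x):
    return sum(i**2 + 10 - 10*np.cos(2*np.pi*i) for i in x)

# Returning True from the callback stops the solver early
# (0.25 is an arbitrary threshold, only for illustration).
def early_stop(xk, convergence):
    print(f"x = {xk}, convergence = {convergence:.6f}")
    return convergence > 0.25

bounds = [(-5.12, 5.12), (-5.12, 5.12)]
result = differential_evolution(rastrigin, bounds, callback=early_stop)
print(result.message)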
The same documentation page also links to the source at the top. (Very convenient. But even if it didn't, you could download the GitHub repo and grep around, or jump to the actual source file from your IDE or REPL.) With a bit of searching you can find where the callback gets called:
c = self.tol / (self.convergence + _MACHEPS)
warning_flag = bool(self.callback(self.x, convergence=c))
So it is a ratio: tol / convergence.
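Which means you can roughly invert it. Assuming the default tol=0.01 (pass tol explicitly if you changed it), the solver's internal convergence measure at each iteration is approximately tol divided by the value handed to your callback; a sketch:

import numpy as np
from scipy.optimize import differential_evolution

TOL = 0.01  # differential_evolution's default tol (assumption: not overridden)

def rastrigin(x):
    return sum(i**2 + 10 - 10*np.cos(2*np.pi*i) for i in x)

# Undo c = tol / (convergence + _MACHEPS): the recovered internal measure
# shrinks toward zero as the population tightens, which is probably the
# behaviour you expected from something called "convergence".
def callback(xk, convergence):
    internal = TOL / convergence
    print(f"callback value: {convergence:.4f} -> internal measure: {internal:.6f}")

bounds = [(-5.12, 5.12), (-5.12, 5.12)]
differential_evolution(rastrigin, bounds, tol=TOL, callback=callback)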
If you read the documentation again, you can see that tol is a parameter, explained like this:

"Relative tolerance for convergence, the solving stops when np.std(pop) <= atol + tol * np.abs(np.mean(population_energies)), where atol and tol are the absolute and relative tolerance respectively."
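You can get a feel for that criterion by plugging hypothetical population energies into it (the arrays below are made up, just to show when the inequality flips):

import numpy as np

atol, tol = 0.0, 0.01  # scipy's defaults (assumption: not overridden)

def would_stop(population_energies):
    # The documented stopping test: the spread of the population energies
    # has become small relative to their mean.
    return (np.std(population_energies)
            <= atol + tol * np.abs(np.mean(population_energies)))

print(would_stop(np.array([1.3, 0.9, 2.1, 1.0])))    # spread still large -> False
print(would_stop(np.array([0.995, 0.996, 0.994])))   # spread tiny -> True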
You could read the source further to find how convergence is calculated, but the gist is that what you've printed is a ratio comparing a tolerance to how much variance there still is in the population.
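If you do follow the source that far, the internal value appears to boil down to the standard deviation of the population energies divided by the absolute value of their mean (treat the exact form as an approximation of what your scipy version does). Putting the pieces together:

import numpy as np

# Sketch, based on one reading of the scipy source:
#   internal convergence ~ std(energies) / |mean(energies)|
#   value passed to the callback ~ tol / internal convergence
# so the callback value climbs above one exactly when
#   std(energies) <= tol * |mean(energies)|,
# i.e. the documented stopping condition with atol = 0.
def callback_value(population_energies, tol=0.01):
    internal = np.std(population_energies) / np.abs(np.mean(population_energies))
    return tol / internal

print(callback_value(np.array([1.3, 0.9, 2.1, 1.0])))    # well below 1
print(callback_value(np.array([0.995, 0.996, 0.994])))   # above 1 -> would stop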