I'm very new to quantitative and scientific programming and I've run into the minimizer function scipy.optimize.fmin. Can someone explain the basic intuition behind this function to a non-engineering student?
Let's say I want to minimize the following function:
def f(x): return x**2
1) What does the minimizer actually minimize? The dependent or the independent variable?
2) What's the difference between scipy.optimize.fmin and scipy.optimize.minimize?
Given a function which contains some unknown parameters (so it is actually a family of functions) and data, a minimizer tries to find parameter values which minimize the distance of the function values to the data. Colloquially speaking, this is done by iteratively adjusting the parameters until further changes no longer seem to improve the result.
This is equivalent to the ball rolling down a hill, mentioned by @pylang in the comments. The "hill" is the distance to the data as a function of all possible parameter values. The rolling ball is the minimizer, which "moves" over that landscape, trying out parameters until it reaches a position where every move would increase the distance to the data, or at least yield no notable decrease.
Note, however, that this method searches for a local minimum of the distance between the function values and the data over the parameter space. For a simple function like the one you posted, the local minimum is the only one and therefore also the global one, but for complex functions involving many parameters this problem quickly becomes quite tricky.
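To see what gets minimized in the one-dimensional case from your question, here is a minimal sketch (the starting value 3.0 is arbitrary). fmin varies the independent variable x and returns the x at which the dependent value f(x) is smallest: the dependent variable is what is minimized, the independent variable is what is returned. Note that fmin hands the current guess to f as an array of length 1.

import scipy.optimize

def f(x):
    # fmin passes the current guess as a length-1 array
    return x[0] ** 2   # the dependent value that should become small

# fmin walks the independent variable downhill from the starting value 3.0
# and returns the minimizing x (close to 0), not the minimal f(x)
xmin = scipy.optimize.fmin(f, x0=3.0)
print(xmin)   # roughly [0.]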
People then often run the minimizer multiple times to see whether it stops at the same position. If it does not, people say the minimizer fails to converge, meaning the function is too complex for a single minimum to be found easily. There are many algorithms to counter this; simulated annealing and Monte Carlo methods come to mind.
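As a sketch of that multi-start idea (the wiggly test function bumpy and the ten starting points are made up for illustration):

import numpy as np
import scipy.optimize

def bumpy(params):
    # a made-up function with several local minima
    x = params[0]   # fmin passes the guess as a length-1 array
    return np.sin(5 * x) + 0.1 * x**2

rng = np.random.default_rng(0)
results = []
for x0 in rng.uniform(-5, 5, size=10):   # ten random starting values
    xopt = scipy.optimize.fmin(bumpy, x0, disp=False)
    results.append((xopt[0], bumpy(xopt)))

# runs that stop at clearly different x found different local minima
for x, fx in sorted(results, key=lambda t: t[1]):
    print(f"x = {x:7.3f}   f(x) = {fx:7.3f}")

If the end points disagree, the single result of one fmin run should not be trusted blindly.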
To your function: the function f mentioned in the example in the fmin help is the distance function. It tells you how far a certain set of parameters puts you from your target. Now you must define what distance means for you. Commonly, the sum of squared residuals (the squared Euclidean norm of the residuals) is used:
sum((function values - data points)^2)
Say you have a function
def f(x, a, b): return a*x**2 + b
You want to find values for a and b such that your function comes as close as possible to the data points given below, with their respective x and y values:
datax = [0, 1, 2, 3, 4]
datay = [2, 3, 5, 9, 15]
Then, if you use the sum of squared residuals, your distance function (this is the function f in the fmin help) is:
def dist(params):
    a, b = params
    return sum((f(x, a, b) - y)**2 for x, y in zip(datax, datay))
You should be able to minimize this to get fitting values of a and b using
import scipy.optimize
res = scipy.optimize.fmin(dist, x0=(0, 0))
Note that you need starting values x0 for your parameters a and b. If you run the minimizer multiple times to see whether it converges to the same result, these are the values you would vary randomly between runs.
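For reference, here is a self-contained sketch of the whole fit (I could not test it here, but only standard scipy calls are used). It also touches your second question: fmin is the older interface that always uses the Nelder-Mead simplex algorithm, while scipy.optimize.minimize is the newer general front end to many algorithms; with method="Nelder-Mead" it runs essentially the same search, so the two calls below should agree.

import scipy.optimize

def f(x, a, b):
    return a * x**2 + b

datax = [0, 1, 2, 3, 4]
datay = [2, 3, 5, 9, 15]

def dist(params):
    a, b = params
    return sum((f(x, a, b) - y)**2 for x, y in zip(datax, datay))

# legacy interface: always the Nelder-Mead simplex algorithm
res_fmin = scipy.optimize.fmin(dist, x0=(0, 0))

# newer general interface: the same algorithm, chosen explicitly
res_min = scipy.optimize.minimize(dist, x0=(0, 0), method="Nelder-Mead")

print(res_fmin)    # plain array with the fitted [a, b]
print(res_min.x)   # the same fitted values, inside a result object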